SUMMARY

The entire DNA sequence of varicella-zoster virus (VZV) was determined using the
[...] VZV genes to be deduced, although [...] regions
where the genomes differ in functional organization were also identified.

INTRODUCTION 

Most people contract chickenpox [...]
symptoms of shingles. Both diseases are caused by varicella-zoster virus
(VZV): chickenpox as a result of generalized [...] as a consequence
of reactivation of virus which has remained [...] of the individual.
There are no generally available measures for the effective prevention [...] of either disease.
Despite such a motivation for studying this medically important [...], knowledge of the
molecular biology of VZV is rudimentary [...] other four
herpesviruses which infect humans: [...] is that the problems
[...] first shown to have a G + C content of 46% by Ludwig et al.
[...] 1981), who correctly determined the [...]
reported the genome structure of [...], and the derivation of
[...] emerged from initial DNA sequencing studies (Davison, 1983,
1984; Davison & Scott, 1985). In summary, the VZV genome is a linear double-stranded DNA
molecule consisting of two covalently joined segments, L and S. L comprises an unique sequence 
(UL; approx. 100000 bp) flanked by a small inverted repeat (TRL and IRL; 88.5 bp). S contains
an unique sequence (US; 5232 bp) flanked by a large inverted repeat (TRS and IRS; 7319.5 bp).
The genome is not terminally redundant, and possesses an unpaired C residue at the 3' end of L 
and an unpaired G residue at the 3' end of S. Virion DNA contains two major and two minor 
genome arrangements differing in the relative orientations of the L and S segments; whereas one 
orientation of the S segment is present in 50% of virion DNA molecules and the other in the 
remaining 50%, one orientation of the L segment is present in approximately 95% of molecules 
and the other in only 5%. It has been reported that a small proportion of virions contains 
superhelical circular DNA molecules (Straus et al., 1981; Kinchington et al., 1985).
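
The proportions quoted above are consistent with two major and two minor genome arrangements. A minimal Python sketch of the arithmetic follows; it assumes that the orientations of L and S are independent of each other, which is an assumption made here for illustration and is not stated in the text. The dictionary keys and the function name are hypothetical labels.

```python
from itertools import product

# Orientation frequencies quoted in the text.
L_ORIENTATION = {"L major": 0.95, "L minor": 0.05}
S_ORIENTATION = {"S one": 0.50, "S other": 0.50}


def isomer_frequencies(l_freqs, s_freqs):
    """Expected frequency of each L/S combination, ASSUMING independence."""
    return {
        (l_name, s_name): l_p * s_p
        for (l_name, l_p), (s_name, s_p) in product(l_freqs.items(), s_freqs.items())
    }


frequencies = isomer_frequencies(L_ORIENTATION, S_ORIENTATION)

# List the four arrangements from most to least frequent.
for isomer, freq in sorted(frequencies.items(), key=lambda item: -item[1]):
    print(isomer, f"{freq:.1%}")
```

Under this assumption, two arrangements would each account for 47.5% of molecules and the other two for 2.5% each.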

As a culmination of our own studies, the VZV DNA sequence is presented in this paper. Thus, 
VZV is the first member of the Alphaherpesvirinae whose genome has been completely 
characterized at this level. The usefulness of comparisons between VZV and HSV-1, also a 
member of the Alphaherpesvirinae, became apparent when Davison & Wilkie (1983) observed 
from DNA hybridization experiments that these viruses possess several conserved genes 
arranged colinearly in the genomes. The hypothesis resulting from this finding, that the two 
genomes have similar gene arrangements, was confirmed by comparison of the VZV gene layout 
deduced from the DNA sequence with that of HSV-1 proposed from currently available 
transcript mapping and DNA sequence data. Consequently, the functions of several VZV genes 
can be identified on the basis of our knowledge of the molecular genetics of HSV-1, which far 
exceeds that of any other herpesvirus. 

METHODS 

Recombinant plasmids. Plasmids comprising KpnI or SstI fragments of VZV DNA inserted into the PstI site of
vector pAT153 have been described previously (Davison & Scott, 1983). Additional plasmids consisting of
HindIII or EcoRI fragments of VZV DNA inserted into the HindIII site of direct selection vector pTR262
(Roberts et al., 1980) or the EcoRI site of vector pUC9 (Vieira & Messing, 1982), respectively, were characterized
on the basis of published HindIII and EcoRI maps (Straus et al., 1982; Ecker & Hyman, 1982; Mishra et al., 1984).
For S1 nuclease analysis of the mRNA encoding deoxypyrimidine kinase, VZV PstI o was subcloned from the
plasmid containing HindIII b plus / into the PstI site of vector pUC8 (Vieira & Messing, 1982). All plasmids were
propagated in Escherichia coli K12 strain DH1 (Hanahan, 1983).

DNA sequencing. DNA sequences were obtained using the M13-dideoxynucleotide technology (Sanger et al.,
1980). Plasmid DNA was sonicated and then precipitated using polyethylene glycol to give random fragments 400
to 1500 bp in size. The sheared ends of the fragments were repaired using T4 DNA polymerase in the presence of
the four deoxyribonucleoside triphosphates. The fragments were then ligated into the replicative form of vector
M13mp8 (Messing & Vieira, 1982) which had been linearized using SmaI and treated with bacterial alkaline
phosphatase. Ligated DNA was transfected into E. coli K12 strain JM101 (Messing, 1979), and clones bearing
inserts were identified as clear plaques in a bacterial lawn using isopropyl β-D-thiogalactopyranoside, an inducer
of the lac operon, and 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside, a histochemical substrate for
β-galactosidase. Recombinant bacteriophage DNA was prepared under conditions of good microbiological
practice, and clones for sequencing were identified by hybridization of the appropriate nick-translated VZV
restriction fragment. Clones were sequenced using pentadecamer primer (New England Biolabs), [α-32P]dATP
(PB.10204; Amersham) and the Klenow fragment of DNA polymerase I. The latter was obtained from Boehringer
Mannheim in earlier stages of the work, but for the majority of sequencing the Klenow fragment was purified from
the genetically engineered strain of E. coli described by Joyce & Grindley (1983). Sequencing products were
separated in 0.35 mm 6% polyacrylamide-urea gels containing a buffer gradient (Biggin et al., 1983). Each gel was
bonded to one glass plate prior to electrophoresis (Garoff & Ansorge, 1981), and then dried prior to
autoradiography.

Plasmids containing the following VZV DNA fragments were sequenced in their entirety: KpnI t, KpnI c,
HindIII a, HindIII e, HindIII </, HindIII b plus /, KpnI b, KpnI i, KpnI /, SstI g and SstI f. Junctions between
fragments were established by sequencing specific restriction fragments from additional plasmids.

Data handling and analysis. DNA sequences from individual M13 clones were read using a Summagraphics
digitizer pad with a program written by P. Taylor, and were compiled and analysed using the programs of Staden
(1982) modified by P. Taylor for a DEC PDP-11/44 computer operating under the RSX11M system. Open reading
frames were identified using the program of Blumenthal et al. (1982), and translated into amino acid sequences
using a program devised by Taylor (1986). Codon usage was examined using the program of Staden & McLachlan
(1982). Sequence homologies were analysed using the matrix comparison of Pustell & Kafatos (1982) and the
optimal alignment program of Taylor (1984). Hydrophobicity profiles were prepared using the parameters
described by Kyte & Doolittle (1982). A search of the Protein Identification Resource (release 5) compiled by the
National Biomedical Research Foundation (NBRF) for homologues of VZV proteins was carried out using the
WORDSEARCH program of Devereux et al. (1984) in a DEC VAX11/750 computer operated by Edinburgh
Regional Computing Centre.
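
As a rough illustration of what an open reading frame scan over both strands involves, a minimal Python sketch is given below. It is not the program of Blumenthal et al. (1982) and makes simplifying assumptions of its own: an open reading frame is taken to run from the first ATG in a frame to the next in-frame stop codon, and a minimum length in codons is applied. The names `open_reading_frames`, `reverse_complement` and `min_codons` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def open_reading_frames(seq, min_codons=3):
    """Yield (strand, frame, start, end) for each ATG-to-stop reading frame.

    Coordinates are 0-based and end-exclusive on the strand being scanned;
    min_codons counts codons from the ATG up to, but excluding, the stop.
    """
    seq = seq.upper()
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first ATG opens the frame
                elif start is not None and codon in STOP_CODONS:
                    if (pos - start) // 3 >= min_codons:
                        yield strand, frame, start, pos + 3
                    start = None  # look for the next ATG in this frame


# Toy sequence (not VZV): one short reading frame on the forward strand.
example_orfs = list(open_reading_frames("CCATGAAATTTGGGTAACC"))
print(example_orfs)
```

A scan of this kind only proposes candidates; deciding which candidates are genes needs further evidence, such as the codon usage analysis mentioned above.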

S1 nuclease analysis of VZV deoxypyrimidine kinase mRNA. Cytoplasmic RNA was prepared by the method of
Kumar & Lindberg (1972) from uninfected human foetal lung cells or from VZV-infected cells showing 50% c.p.e.
Structural analysis of the deoxypyrimidine kinase mRNA was performed using the S1 nuclease digestion
procedure of Berk & Sharp (1978) modified by using 5' or 3' end-labelled probes. DNA/RNA hybridization and S1
nuclease digestion were carried out as described by Rixon & Clements (1982), except that 15 µg RNA was
hybridized at 45 °C to less than 1 µg end-labelled DNA fragment isolated from the PstI o plasmid. The digestion
products were separated on DNA sequencing gels and detected by autoradiography.



RESULTS AND DISCUSSION 

VZV genome size 

The entire VZV DNA sequence is shown in Fig. 1. It was derived from approximately 1.2 x
10^6 nucleotides of data, and about 97% of the sequence was determined on both strands. The
genome size, from the plasmids analysed, is 124884 bp, and the G + C content is 46.02%,
impressively close to the value of 46% derived from buoyant density centrifugation by Ludwig et
al. (1972). The sizes and G + C contents of components of the genome are as follows: UL,
104836 bp, 44.33% G + C; TRL and IRL, each 88.5 bp, 68.36% G + C; US, 5232 bp, 42.78% G
+ C; TRS and IRS, each 7319.5 bp, 59.04% G + C. The significantly higher G + C content of
the inverted repeats is a general feature of herpesvirus genomes, and has been discussed
previously for VZV (Davison & Scott, 1985).
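
The component figures can be checked against the totals with a few lines of Python. The sketch below uses only the values quoted in this section; the variable names are hypothetical, and the half-integer repeat sizes are used exactly as quoted.

```python
# Component sizes (bp) and G + C contents (%) as quoted in the text.
COMPONENTS = {
    "UL": (104836, 44.33),
    "TRL": (88.5, 68.36),
    "IRL": (88.5, 68.36),
    "US": (5232, 42.78),
    "TRS": (7319.5, 59.04),
    "IRS": (7319.5, 59.04),
}
RAW_NUCLEOTIDES = 1.2e6  # approximate amount of sequence data collected

# Total genome size is the sum of all six components.
genome_size = sum(size for size, _ in COMPONENTS.values())

# Overall G + C content is the size-weighted mean of the component values.
gc_percent = sum(size * gc for size, gc in COMPONENTS.values()) / genome_size

# Average number of times each genome position was read.
redundancy = RAW_NUCLEOTIDES / genome_size

print(f"genome size: {genome_size:.0f} bp")
print(f"G + C content: {gc_percent:.2f}%")
print(f"average redundancy: {redundancy:.1f}-fold")
```

The components sum to the stated genome size, their weighted mean reproduces the stated overall G + C content, and the quantity of raw data corresponds to each position having been read nearly ten times on average.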

Five regions of the genome contain tandem direct reiterations of short G + C-rich sequences.
One, in TRS, is a duplicate of that in IRS, and so the four unique reiterations are denoted R1 to
R4 in Fig. 1 and 2. Regions of the genome which vary in size between different virus isolates
have been mapped by Straus et al. (1983), and correspond approximately to the locations of
R2, R3 and R4. R4 (109762 to 109907 and 119990 to 120135 in Fig. 1) has the structure
AAAAAX, where A is a 27 bp element and X is a partial copy of 11 bp of A. Casey et al. (1985)
reported that the copy number of the 27 bp element varies between virus isolates.

The region containing R3 (41453 to 41519 in Fig. 1) is the most variable in size between virus
isolates (Straus et al., 1983). Moreover, fragments containing R3 are particularly difficult to
clone in E. coli, and those cloned fragments which are obtained may differ significantly in size
from the virion DNA fragment (Straus et al., 1982). Thus, the R3-containing clone which was
sequenced (HindIII e) is smaller by about 1000 bp than the estimated size of HindIII e cleaved
from virion DNA. In this clone, R3 has the structure AAAAABAX, where A and B are
unrelated 9 bp elements and X is a partial copy of 4 bp of A. Preliminary analysis of an
independent clone of HindIII e which is about 500 bp larger showed that the additional sequence
is contained within R3, and that the reiteration contains a complex arrangement of 9 bp 
elements, including one not present in the smaller clone (data not shown). The sequencing 
results, and the discrepancy in size between virion and cloned DNA fragments, imply that R3 
may contain in excess of 100 copies of the 9 bp elements in virion DNA. Presumably, this highly 
repetitive structure is unstable in E. coli, so that stable clones are obtained rarely and lack many 
of the 9 bp elements. 
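
The element structures given for R4 and R3 can be checked against their coordinates in Fig. 1, and the estimate of more than 100 copies in virion DNA follows from simple arithmetic. The Python sketch below does both; the function names are hypothetical, and the copy-number estimate assumes, for illustration only, that the roughly 1000 bp missing from the sequenced clone consists entirely of 9 bp elements.

```python
# Coordinates, structures and element sizes (bp) as quoted in the text.
REITERATIONS = {
    "R4": {"start": 109762, "end": 109907, "structure": "AAAAAX",
           "sizes": {"A": 27, "X": 11}},
    "R3": {"start": 41453, "end": 41519, "structure": "AAAAABAX",
           "sizes": {"A": 9, "B": 9, "X": 4}},
}


def span_length(start, end):
    """Length in bp of an inclusive coordinate range."""
    return end - start + 1


def structure_length(structure, sizes):
    """Length in bp implied by a string of element letters."""
    return sum(sizes[element] for element in structure)


for name, r in REITERATIONS.items():
    print(name,
          span_length(r["start"], r["end"]),
          structure_length(r["structure"], r["sizes"]))

# Rough estimate only: ASSUMES the ~1000 bp missing from the sequenced
# clone is made up entirely of 9 bp elements.
CLONE_ELEMENTS = 7   # complete A and B elements in AAAAABAX
MISSING_BP = 1000
estimated_copies = CLONE_ELEMENTS + MISSING_BP // 9
print("estimated 9 bp elements in virion R3:", estimated_copies)
```

For both reiterations the length implied by the element structure equals the length of the coordinate range, and the rough estimate for R3 in virion DNA is well above 100 elements.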

In the Dumas strain, R2 (20692 to 21017 in Fig. 1) has the structure ABABAAAX, where A
and B are 42 bp elements differing in a single base pair, and X is a partial copy of 32 bp of B.
Again, variation in the copy number of the 42 bp elements results in size heterogeneity in this
region of the genome in different isolates (P. Kinchington & J. Hay, personal communication).
R1 (13937 to 14242 in Fig. 1) is a rather complex reiteration containing four elements: A, B, C
and D. A and C are 18 bp in size and differ in a single base pair, B is 15 bp in size and unrelated
to A or C, and D is 15 bp in size and consists of the first 6 (or 7) bp of A or C linked to the last 9 (or
8) bp of B. R1 also contains a partial copy (X) of 3 bp of A, B or D. The sequence of R1 was
TRL x ul

«GCCAGCCCTCTCCC«mW^^^ ^ 

A&CTTTAACAAAACCCGCGCCnTTGCCTCCACCCCTCGnTACTGCTC^ 240 

6GATGACGTTGCGACCCCCATCCCCTACCTACCCACATACGCCGA6GCCGTGGCAGACGCGCCCCCCCCTTACAGAAGCCGC6AGAGTCT^ 360 

GGAGAATGGCACCACCCAACAGTCTTACGATTGCCTAGACTGCKTTATGATGGAATCCACAGACTTCAGCTGGCTTTTCTA 430 

^ TTTTGGT ATTC TCACCCTTACTGC T6TCGT6GTCGCCATTG TTGCCGFTTTTCCCGAGGAACCTCCC AACTC AAC TAC ATGA AACT AC TGTCCGGAAGGGG A AG6T ATTT A TTC TC6CTT 600 

• - E ft t 106 

GCAGCTTGTCGCGCGTGTATGCACAACAAAAGCTATATATGTCACCAAAGCCAACGTCGCCATCTGG^^ m 

T H Y Y F A I Y T YLALTAMQL Y Y 6 L VNCLRDUQMCLQA 66 

ggacccctttctccgggatcgt«h:cttgggacatcaaccagt6gaataa6aaccgccw mo 

S A r R R S R P R P V 0 V L P I L ¥ A P P R T 0 V V L P S A M H 0 « L E T H Q Y 26 

C ATA A AC AC TGT A TCGCTATC AG A TTCCCGA AC ACCTTCCGGT ACCCC A T AC TCC G AT ACCCTGGACA TTGCGG A TCCCAAA AATATAAT ATTAACA 960 

NFVSDTDSOSERVGEPVGYESVRSH t 

GCTACAGCTTATATAAATTTATGTGCGATACATCTTAAGTGCATCCGTACGTTATTTATACATTGCCTGTCACGTGAAAAGACTGTGTTACCCAATAAAGGTTCTACAAU 1080 

2 MHVISETLAY6HVPAFIMGSTL 22 
TTGGGTGTTTGTTTAATAGCTATTATCGTAACCCACCCCCGTAAAATCATAAAATGCAT6TAATTTCTGAGACACTTGCATATGGGCATGTTCCC6CATTTATTATGGGCTCCACTCT6G 1200 

YRPSLNATAEENPASETRCLLRVLAGRTV0LPG66TLHIT 62 

TGCGTaCAGTTTAAACGCCACCGCCGAGGAAAATCCCGCGTCAGAAACGCGATGTTTATTACGAGTGCTTGCGGGGAGAACTGTAGACCTGC^ 1320 

CTKTYVI IGKYSKPGERLSLARLIGRAMTPGGARTFI ILA 102 

GTACCAAAACCTATGTAATTATTGGCAAATATAGCAAACCCGGCGAACGTCTTAGCCTTGCCCGTCTAnAGGGCGTGCAATGACGCCTGGAGGTGCUGGACATTTATT 1440 

MKEKRSTTLGYECGT6LHLLAPSMGTFLRTHGLSNRDLCL 142 

TGAAGGAAAAGCGATCCACAACGCTTGGGTATGAATGTGGTACGGGCTTGCnTTACTGGCTCCATCTATGGGTACAnTCTCCGCACACACGGTTTAAGTAACAGAGAK 1560 

« R G H I Y 0 M H M 0 R L M F « E N I A Q H T T E T P C I T S T L T C H L T E 0 182 

GGCGGGGTAATATTTATGATATGCATATKAACGTCTTATGTTTTGGG^ ^ 

S G E A A L T T S 0 R P T L P T L T A Q G R P T V S H I R G I L K G S P R Q Q P 222 

CTGGTGAAGCCGCACTTACCACGTCAGACCGACCCACTCTCCCAACCCTAACAGCCCAAGGAAGACCAACAGTTTCCAACATTC6TGGAATATTGAAAGGATCCCCCCGTCAACAGCCGG 1800 

VCHRVRFAEPTE6YLM- • 2M 

TCTGTCACCGGGnAGATTTGCCGAACCTACGGAGGGCGTATTGATGTAATCACTAAATAAUTACACCTTTTTTCGATT6TAC6TATTTTTATTTAAATGTGTAGTTCATAGICCGCCG 1920 

- I G C 177 

ACAGCCGCTCGGGCTTTTCCCCCACATACAACATGATCGTATGCCTCGGATGCACCGGTCCAACACTCCGCCGAGAAGGGGGATTTACAATGACAGTGATACCCAATAGCCW 2040 

V A A R A K G G C V Y H 0 Y A E S A G T W C E A S F P S - r C H C H Y 6 I A A L H 137 

ACACCCAGCTGTCCGGACTCCAGCATCATCTGCTGAGTTGCGCCGCTGAAGGGTGCATCGCATAGGCTGTTATAATTAGCCATTTCCGG 2160 

VGLQGSELHMQQTAASFPAOCLTNYNAHEPLLRQSNLLSC 97 

AACGGCTGTAGGTCAACATACATTGGGGATTCAGATGGTTTATCTCGAC6TCCAAGTCCAATCAAAAAAGCGTGTUATCATCAGCCC6GCCGCATGTTGCTCGAAGAK 2280 

F P Q L 0 V Y H P S E S P I 0 R R 6 L G I L F A H L 0 0 A R G C T A R L A C L fl 57 

""CACCGTACAGAGGGOATGGCGTCGGTGCATG^ 2400 

r V G V L P S P T P A H S N A P C T % T T E L A L P P I V Q T F S A L P D P I I 17 

nCACTCCGATGGGnGACTOT^ ^ 

NYRIPQSSESA6TTDM ^ 

GAAAGCAATCTAATCCCGCCCATATATCCCCAACGTCGCCTTAWAACTCCCACAATATTACATTTTTATTAGTCTTTTATTAATATAGAATCACATAAACA 2640 

TGGTGTATAATGAnAAAAATATAAATTGATATGTTHACAAKATGAAATAGGW 2760 

CATGAAAAAGGTATTTTTTATnTAGCAGTTAAAGGTACTACACTTAAAATATTTACCGTATGGACGGGCGTCAGAAAGATGCCCGGCCCAAGTTCAGAGGGTACATTCAACA 2880 

4 * - CNFTSCItFYK6YPRADSLHGAiTSLTCEVR6C 421 

ACTC6C6TTGGTGGGTGATTAGGGCCTCTAAAACACCGGCCAGACATGACCCGGGTGTATATTCTTGTAACACTTGAACGTTACAACTGATATCATCATATTCCACAAATTT 3000 

E R Q H T I L A E L V G A L C S G P T Y E 0 L Y Q Y N C S I 0 0 Y E Y F t S 6 R 381 

6GACAACTATATTAGCAATGCGGGCAATCATAACAAACATATAAGTAGTAATACACGTGATATCACTAAAACGTTGCTGGCGCAACAGTTCGGGGAGAGTACGAGACCCCAAATCGTTGT 3120 

VVINAIRAIMYFMYTTICTIDSFRQQRLLEPLTRSGLDND 341 

CCCTGTTTAGAAGAAGACATCTTACAAAAGGCCCCAGCTTTAACnTAAATTCTCCAAUGTGACTTCGAGGTTGCAACAATGGGATTATTTGTGTAGATGGGCAAGTTTTTT6CCGCTA 3240 

R N L L L C R Y F P 6 L I I t L N E I I S I S T A Y I P N N T Y I P L N I A A L 301 
ICAtnTAATCCACGTUACAGTTCATCCGCAGACTCCAACGCnCAATCAAAGATTCT^ 3360

U t I » T L L E 0 A S E L A E I L S E 6 R I V R E R L A R A I R S 0 M K Y S £ F 261 

IGGTACGATAAAGTTCATGTCCGTACAACATCAACTCC^^ 3480 

TRVLEHCVLMLEPiST«QCIGPFCG6SAiSOGE0ITAVRV 221 

CTACCAGTnTTCAATTGTTTCATCTAAATGGC6TACCGAGTCAATG6TCACGCT^ 3600 

VLCEITEDIHRYSOITYSAGATSVVEIARGTIROYPIDYE 181 

CTTnCGAATACKTCTCGGCGGGCGTCTCTCnGGAAAATCGCAACCTGT 3720 

t R 1 R E R R A 0 R t S F R L R Y S E D H T Q 0 N R E G T T H A P P T T P R R R 141 

ITCGAnTGACAGGGATC6ATCAC6GTGTffTCnGAACTn6A6TGnATAW^ 3840 

R N S I S R 0 R H t R S S 0 T N Y S R S S R R H 6 R E I R A Y G T E V 0 R G G H 101 

6ATGGTTTGAATC6GGTAATACUCAACCAAAGTTTTCGG6C6^ 3960 

H N S 0 P L Y V V L T I P R « H H Y S E R R 6 H R R I S Y R P N Y A T D A T R E 61 

CCACATACACAGTTCTAGAC6nGT6GAGTCCTC6CCT66AGTG6A6CCAATAGOT 4080 

VYVTRSTTSOEGPTSGIAEONAiOTYELAFOOSPEOPLLN 21 

TCATAAAGTCTTCACAAATAGTA6ACACGTCTGGGTCGGffra^ 4200 

MFDECITSVDPOTPISASAM 1 

T*GAATTTATAGGCTCACCCAACCCCGCAATGGGC6T6TTTAGTCACATGATTWTGCnCTGGGA6TTnCACTTTCCCCAAACAAGCnACCTGCA 4320 

-HKQSNESE6FISVQVRQEYHMF 319 

AATAACCACTGCTATAGCAAATATGACGATATAAAAACATTTC^ 4440 

IVVAIAFIV IV FCKIALGSMVTACCTTCI6RIGGTNIHHL 279 

ATGAnATCCTTG6TTGGTTTTGGTCTAACATAAGATATAA6CTCTACM^ 4560 

H N 0 t T P It P R V Y S I L E V I A L T C Y V V V A L I R I Y T P Y L L A C P T 239 

AT AT6C A ACGtC A AGCGTT A A A AGC AC AATACATCC A6ATGAT ATATG AGCGATA ACCTCC AA 4680 

YAV61TLLVI C*6 SSIHAIYELLMLLTYGKHMYLFSIPOAH 199 

AAATACTnACTCATACCATCttGTCGCATGGAUCATCACATAA 4800 

F Y I S M 6 0 R R M S V D C L L R A L I T Y P Y G L L I N S I V R T V Y N L S Y 159 

TGATGTGG666ATAnAACTCACA6GAT6ATC6GUTGGCCCAAACATAC6AC6T^ «M 

S T P S I L E C S S R F P G F M R R I R R F Q L Y N 6 Y V F V A F F I M It L Q Y 119 

GCAGCAAAAATAAGCGTGACAAmCGTGnCCCAGAACAAnCGAA^ 5040 

C1FYAHCHRTGLVIRIC0HLPTSIATKVHT06SLLNERCS 79 

6TAATC6TATCCAGATAGGCCATCCAAAAACGn6AGT6GTTTACAAACGnACATATATAAGAGAGTTGnATAAGACCCCCATACUCCGGTCCACCAnUTCA 5160 

TDYGSLGOLFTSHNYFTVYILSIlHYSGiYYPGGNIYTTAY 39 

CACACACTCAT6nCAAACTTTACACGAGCGGTATACCATAGG6T 5280 

YCEHEFKYRATYILTFVAH6SLLCMI IFHETK IGLAQK 1 

T6GAAGATGGCAAnCAAGCAC6ATCTAGTATCACAC6Gn6GTGnAACTC6AAGnAAATTTGGATAATTAGGTACnCTAGAG 5400 

-SSTLNPYNPVELTF ITHMRHSOCK 1060 

T6TAGCAAAACATT6TTGTGCU6CGAAATACACAAACGGnGT6ATGATCCACTCGCA6AGACACAAATGTCC6GGGAGCCGncnCCTCCGCGA 5520 

TAFCQQALSICLRNHH0VRLSVFTRPATR66RHPYRLCTF 1020 

CCCTTTTGnCC6CATAT6AGCT6AAATAACACttAGTCCC™^ 5640 

GKTGCIL0FLYI0RIIAIC)tSSTLTYER0R6'iPFHYAIHEV 980 

AATAGATTCTAATATTGTGCT6TC6ACAAAGGCCTCCAGTGTAAAT6CGTCCAGACAAGTTACCCCGCGCTCnnAGAGCCTTTGTTAAAGATATnGCG 5760 

ISELJTSOVFAELTFAOLCTVGREILAKTLSIQPSSFIQI 940 

A TT ACGCGC AACCTT AC6TTC AA A A A ACTCTGC6T ATTCCCCCCC A AGGTTAT6TA A AAT AA ATT6C ACTGG A AC ATTCG AC TGCGGTC TTG A ATGA A A AT6A AA6TTTGCCGGGTTTCT 5880 

NRAYIREFFEATE6GLNHLIFQVPVNSQPRSHFHFNAPNR 900 

AT6TGATGTCACAAACKTAATATATCAATACACTGCTCAGGTACUCATAAAATG^ 6000 

HSTVFALIOICQEPYYYFPLLQGVATGTTTVrSFFPLRLS 860 

AT6TCCGT6GCTATAAACACCAGTATCTATAAACGAAAA6TCCCGTAAATACG6A 6120 

H6HSYVGT0IFSF0RLYP6IYEVFERELLYAQQLIQAFGK 820 

TAAAGTG6AAGACCCCACTAAC6CATAGGGAm6GGAnGGTACGCATACCCTGAAACCTAnnCTCTnACA6nACAGGGTA6AGmCATGCA 6240 

LTSS6VLAYPHP1PYCVRFGIKEKCHCPLTEHLNEMNSV0 780 
6GCaTGT6UTGGACTTUGACGnGTCTGT6TATCU^ 6350

AHTHVESTTQTDFFIVDETYNEEVCTYLE6NEIYFD1D$H 740 

6CT6GnATATCCAATAAATTATCATCATCCAACACCTCAACG6TAGGTTCA^ 6480 

STIDLLNOODLVEVTPEPCATKYFYCPDKNPNVVACPFLl 700 

nGCATGGCC6nAAAATACCATGACGAAATGCTCGCATGCCGGCATGTAAAATACCCAATGGGATGGGTTnCTTATUGAAAGTCTACATCAA6 6600 

QMATLIGHRFARHGANLIGLPIPKRIHFDVOLILNTIILN 660 

TGTAffAAATAGCTCATTCCTGnTATATAAAGCTGATCTn6&^ 6720 

TNFLENRNIYLQDKPINSS1K5VNKVSTSLLLTLQHDRLS 620 

AGC6GCAACAUAnACATGGATTAATTTGTTTAA6GTCCTCCG€AAYTUTCGAGCCTCGTGCGGTAAAGTGTAAC6GTU^ 6840 

AAVFNCPNXQ, KLOEAILRAEHPLTYRNTISSNTONAIVAF S80 

T6CnG6GC6CCGTGAGGCAA6GCTACCCGATATACAGGCATT(H»TO 6960 

A Q A 6 H P L A V R Y V P H P 6 T V E S H G I L A E L P T I Y S L I S ¥ D H T A 540 

TATCCCA6TGGCAGCAGAGAAAUCAGTAATAGTTTTGTAATCCCCGG& 7080 

IGTAASFFLLLKTI6PSTDFGTRGSQNPTDNAFNAARIAE 500 

CGCGGAAACACCCGAATCCTCAAAATTAGACAATTC6TCAAAACCGGGT^ 720 o 

ASVGSDEFNSLE0F6PPN5PITSSVGYPSFQKEINEVRRT 460 

T A6CGTTGTAGCTAG6TCACA T AC6CCT A T AA AC TTGCT AGGT TTTGCGGC AT AC6T A AGACTT A A ACT AT ATGTT TT ACT AA TTGT AT A TT T ATGTCCAAICTCAG6TCC AAGT TCAGT 7320 

LTTALDCV6IFKSPKAAYTL5LTYTKTITYKHG1EP6LET 420 

GACATCACAAATTACGffCnTnTATATAGTCACGCATGTTGAGAC 7440 

VDCIYNKKIYDRMNLRSRVHNFFNATARK6LNSSKLLVP. K 380 

ATTCACAAAATCTGAGTATGTAACCGCTTGTAGGTGGrCTGCGATCTGTTTCCGATTGAAACATTCAAAATGTGCCAGATAAATATAATCAACAUnCAC^ 7560 

NVFDSYTVAQLHOAXQKRNFCEFHAIYIYOVFERDPYKLG 340 

TTTTCTATCGTTGGT AAT AT ACTCCG ATACTGCGTGT A T T TCCGTTGTGTCTGT ATGTATTCGC TGTA AA ATGT ACG ATAGAGC A TTTT TGGCTGTCA AACCTCGTGTAT A TGTTGAGGA 7680 

KRDNT1YESVAHIETT0THIRQLIYSLANXATLGRTYTSS 300 

ACAACAAAACATGGAAAGTTTATCAAAAGACAACAAGTCCGAAATATTGTACCCACTACAATTAGGTAATGCCGGGACTTGGTAAGnAAAAACAAATCTTTAATTGCCTGTAAGTCATA 7800 

CCFMSLKDFSLLOSINYGSCNPLAPVQYTLFLOKIAQLOY 260 

TAAGGGGGTTTCCA ACGT ATTGT AAC TTGT6TCCGTTTGT A AC AA6T AAT AGCGT6TAGCC A AC ACT AGCGTT TT T TC AG AGGCTCC AAATC6AAC AAT ATACCAA AACGGC6AGC A TCC 7920 

LPTELTNYSTDTQLLYYRTALVLTKESPGFRV1Y1FPSCG 220 

ATACCC«AGTAGAGTCGTCGATATGCAGCCAATACTTGACGTTCGTUTGGGCATATAATGATGTTAGCTCCTGACGACCAAC6GATTTmMCTAACTT6CAGAGTGT 8040 

YGIYLRRYAALYQREYHAYLSTLEQRGV5KKVLKCLTAET 180 

GATGCATAGGCC6nGTCCGATAATCCCnTCGGnTAAATGGT6TGTTGTTACCATCAGAGTTTGTATAACTTCCGAGTGAATGTCAAACGTCTCCGATATACA 8160 

ICLGN0SL6KRNLHHTTVMLTQIVESHI0FTESICLT0SI 140 

TATATGCGGATTTAGGGGTGtfCCATACCATAACGCCTTATATAAAGCTnA 8280 

IHPNLPA6YNLAKYLAKFDTQTKFCCFFIPIVRSRVD6TS 100 

AAATCCACCAATTAAATAAAAAATAACGTTGACGTCCCTACTACAAAATAAATGCATTATTTGGTTTTCTTCATCGTTTTCAGTTACnCACGTGGGC6nTAGTTGGGATTACTTGCGT 8400 

FGGILYFIVNVORSCFLKMIQNEEONETVERPRKTPIVQT 60 

GATCTCnCCCTCCCATTTTTGACAAAGACGTCATCTAAGTC^ 8520 

IEERGNKVFVD0LDPTHTYSVVYLPETSIQ6TLLLLSHSI 20 

MQTVCASLCGY 11 

TGCACATCCCTnGTGGCAAATAATAACCGAATCGTC6GTTTGGAGGATTTATCCATAGTTCUTACGTTGGAAAGCCAGTCAATCATGCAGACGGTGTGTGCCA 8640 

AC6KTAFLLRITPKSSK0M 1 

A R I P T E E P S Y E E V R V N T H P Q G A A L I R L Q E A L T A V N 6 I I P A 51 

CTC6AATACCAACTGAAGAGCCATCTTATGAAGAWiTGCGT6TA^ 8760 

PLTLEOVVASAONTRRLVRAOALARTYAACSRNIECLKQH 91 

CTCTUCGTTAGAAGACGTAGTCGCTTCTGCA0ATAATACCCGTC6TTT6GTCC6CGCCCAGGCTTTGGCGCCAACTTACGCTGCAT6TK 8880 

HFTEDNPGLNAVYRSHMENSKRLADMCLAAITHLYLSVGA 131 

ATTTTACTGAAGATAACCCCGGTCTTAACGCCGTGGTCCGTTCACACATGGAAAACTC^ 9000 

VOVTTODIYDQTLRMTAESEVVMSOVVLIECTLGVVAKPQ 171 

TGGA TGTTACT ACGG ATGATATTGTCGATCAAACCCTGAGA ATGACCGCTG A AAGTGA AGTGGTC A TGTC TG A TGTTGT TCTTT TGGAGA A AACTCTTGGGGTCGTTGCT AAACCTC AGG 9120 



ASFDYSHNHELSIAKGENVGLKTSPIXSEATQLSEIKPPL 211 
CATCGTTTGATGTTTCCCACAACCATGAATTATCTATAGXTAAAGGGGAAAATGTGGGTTTAAAAACATCACCTATTAAATCGGAGGCGACACAATTATCTGAAATTAAACCCCCACTTA 9240 



VZV DNA sequence 



1765 



IE'vSDNNTSNLTKKTYFTETLQFVLTPKQTQDVQRTTFAI 251 

TAGU6TATCGCATAATAACACATCTAAttTUCUAUAAAC6TATCC6ACA6AAACTCTTCAI^ 9360 

trSHVMLV- * 2S9 

AGAAATCCCATGTTATGCnGTATUATAnGAAATAAAAACTAAAAACGTnCTGGTGTATC^ 9480 



TGTTTTAGTAGAAAATCGACATCGmGTnCTTTATCAGTTGAACCAAATCCACGCCnCCCCCTTCGCTGGGTGTGGCTAnAGATCTAACGTnTAGTUAATACCAnGTACACCC 9600 

HKLLFDVONTEKOTSGFGRTGRESPTAILDLTCTFYIQVG 357 

6GTATGCCACAnTACCGCG6ATAGCATAAG6AUTGCAATATTACnAAAACGnGT^^ 9720 

PIGCKGRIAYPFAINSLVNHKLHIQTNHDILLVQALRD6T 317 

mATACGTATGTCATCACCCGTGAGAnATATACGTAGAAnTACAGTGTTCTCCTGCAGGCCATGCCGTTGaCACACGATUTGaTGATCGGCmTCGATGATCnCCAAAAATA 9840 

KIRID0GTLNYVYFKCHEGAPHATPCVIIGSRSKSSR6FI 277 

TAAGCGTTTATACTCG6ATGnGTAA6TCCCAGTCTCTTATAATC 9960 

YANISPHQLDNDRIIPLVIKIFENRKLYLNYPVCIDYGAD 237 

TCTT6GCGnTTGGATTGATGATATGnTGTAGGUAAGGGAACATtf^ 10080 

EQRKPNIIHKYTLPVDIHYEASORHLPQGQQVTSIOAFEP 197 

ATAACGG6TTTTTCATAATTTGACGGC6A6TTTGATAAGGGTW 10200 

IVPKEYNSPSNSLPQVQISKFIPOLHKLVNKPLLRSQRLK 157 

ACC6GGAACAAGTAGAnGnAAAT6TCCGGGTAAAATAACG6nACTCCTGGCC^^ 10320 

YPFLYITLH6PLIYTVGPRYYLLASIVGRY6ADIY6NAYF 117 

AAATTGTCTTCATCAftAAIKKAGTATCTTTGCAnGAAm 10440 

FNDEOALATOKCQILLLAYENPPASKVLLELQQHYFGGRT 77 

ACAGATTTTTCAGATGGCAGTTCGAGTTTCTTGTGGTTCCGGAGTA^ 10560 

VSKESPLELKKHNRLILPQHRSVKOOLVCQLTOACEPQVE 37 

AnAAAATTGTATCTTTTAAACACCGATTCttAATAGTTTGGC^ 10680 

ILITOKLCRNPITQSCFHOGTNVATELIPDIVAENM 1 

CAAATUTATTTTTTTGTGAAGACAGCAGTGGGGAGCCAAACT^^ 10800 

nCTTCTACGCATCCCTTTTGGGGGTGTGTGTAGCCCTTATTTCGTU 10920 

MASS060RLCR 11 

AAAGGGGCAGGCGTGTATAAGAGGGCCCCTGTnAATACGCGGTCTGCCGTO 11040 

SMAVRRKTTPSYSGQYRTARRSVVVGPPDDSODSLGYITT 51 

CTCTAATGCAGT6CGTCGTAAAACAACGCCTAGTTATTCCGGACAATATC6A 11160 

VGADSPSPYYADLYFEHtHTTPRVHQPHOSSGSEDOFEOI 91 

AGTTGGGGCCGATTCTCCTTCTCCAGT6TAC6C6GATCnTAnnGAACATAAAAATACGA 11280 

DEVYAAFREARLRHELVE0AYYEHPLSVE1PSRSFTINAA 131 

CGATGAAGTAGTGGCCGCCTTTCGGGAGGCCCGTTTGAGACATGAACW^ 11400 

VKPKLE0SPKRAPP6A6AIASGRPISFSTAPKTATSSVC6 171 

6GTTAAACCTAAATUGAGGATTCACCGAAGCGAGCTCCCCCGGGA 11520 

PTPSYHKRYFCEAVRRVAAMQAQKAAEAAINSNPPRHNAE 211 

TCCTACGCCATCATATAACAAACGCGTCnTTGTGAAGCGGTCCG6CGCGTAGCCGCCATGCA66CACAAAAGGCTGCCGAAGC6GCnG6AATAGTA 11640 

LDRLLTGAVIRITVHEGLNLIQAANEA0LGEGASYSrR6H 251 

ATTAGACCGTnGTTAACCGGAGCCGTTATTCGTATTACG6T6CATGAGGGTTTAAATTTAATACAAGCC6CTAATGAAGCAGACCTAG6T6AAGGAGCAT^ 11760 

NRKTG0LQG6MGNEPUYAQVRKPKSRTDTQTT6RITNRSR 291 

TAATCGAAAAACTGGAGAnTACAGGGGGGCATGGGTAATGAACCTATGTACGCACAAGTtt^ 11880 

ARSASRTOTRK- • 302 

GftCCGTTCTGCATCAAGAACTGATACGCGAAAATAtroATATAATTAC^ 12000 

ATAAATGTGTATTATATCTCACATAnATAAACT6TTTAAATAGTACCACGTGGTATTATGAACAGnTATAATCAGnGCTACCAAACAAACCttAnmC6GC 12120 

HECNLGTEKPSTOTVNRSCTEQAVVOA 27 

GGAATCGCTTATTTAAACTAAAGAnTTACTCTATAAGTATGGAGTGTAATW 12240 

FDESLFGDYASDIGFETSLYSHAVCTAPSPPIVASPIILY 67 

TTTGATGAATCGTTGTTTGGTGATGTAGCATCGGATATTGGATTTGAAAtf^ 12360 
Ft Q F T C A F H P ^ m ^^ m ^J^^..„^ nklltrAU ^x,



*« ftC ccwiJipRSVGFLTHHl«Rlt 347 

* 410 

c >< B X B >< C A 8 

reiteration Rl sVSQVCSRDA 230 

A >< ■ B X 6 X * I ._.„> 

""""" vuCP4 wsrELNVSrRCVEP 270 






^Ug^^ 1S *° 

GI gI»™kt»™ 15600 

,.„.,.n»«ieHANLLLLLl$6A»lY66CSIYIPR 710 

IU L!gtIcC^ 15720 

Itattttaga^t^^ ™° 

Lkgg^tt^^ 15 *> 

0 L A E N L H Q Y R H E I L 6 _J:_ 0 __!!. * . ^„!.,^ 1 !!^l! TY T.rTirLi! iTlTTiinfJiTTfiTGinTnTTCITTAGGATGAAAiG 16080 
ACCTAGC 



UCGTTTCCTAGCCACACCCACAAAGGAGmGTAAAATAA^^^^^^ ™» 

y t S ft F A R S F S S D 0 B T R t S T 0 G S Y 0 S F N A 6 E R 0 I P I P » 

TGTCCAIATCGCAAIGTTtICTCGGITTGCGCGTICCITITCCAGC6ATGAIAGAACGCGTAAATCTTAT6ATGGIAGT^ 16320 

,. n .i-ct<!0(IITSERYRDGCLlPIPC£»L£TAVtALSF.II 7$ 

mCCGWKT^ 16 "° 

CGACAGttTAAtATCGCCGGTTTTWAAAGTACCGAAAGACACAW 16580 

,cun.»iiFIIORtL00LtLSG0GL»«YYEMTYI(0YltYTI 156 

tG^TI^G^ 166,0 

i i i i i i i ( t I (I I I S I S H L I f S ! 1 M I I ' I S • II f f I S I ' 136 

A6GAGCCGAGGIACCG6T6ACTTUGA6AA66tAAATAUAAGICIAAATCCA^ 16800 

tataa*™^ "» 

. I ,. vl . T6 YBiyiKFLOYY0IRYtRHLCLOFRRIR6 276 

acgtgtkwgamttgcg6cttIcicggctactgktacgggtm^ 17M0 

gc^g™!™ 17180 

. i i P I s V L I R I E E » R 0 G L N G T A A » I Y A » Y E I I T I I H H H F 0 3S6 

TKGCTTTUlAnA™ ™ 

». , „ „ y I IBYAC»GOGGLIIDPYILIALRA«GRFLYFAGOL 396 

ATACTTAATTAATATGATGCTTATTG6ATATGCATGn6GGGG6ATGGGGGATTAAACGATCCnATATATTAAAGK 17400 

yDTucTHSi¥YLETSTHMlFSRAYAQSILAH66KPYKYYA 436 

6GICAGAACAATGTMACACACAGTYG6Gn6I6TTA6AG»CCAGCACCCATATG16GnTTCCCGG6OT ™» 

„ . , i icrnvTPLHLRRISEPSSYSOQPY IRFMRL6SPIGT 476 

T c!gG,TCTT<£g^ 17M ° 

r i r y I F C Y C L T G N Y t S D D Y N A S S H Y I » T E A P I N S I » P 0 T N 516 

AMTAIAGreAAnTGGAATGTGTCIGnTAACGGGAAAnA 17760 

TAGACAaHacnCTCGMTTTTAGTO 17 **° 

ftuwtoiriFrtGPYROES^TMFOORSOLRHIETOASLMOH $96 

TGATCATTACACCCMATAAAGGC6GAATAT6AAGGTCCGGTTCGGGITGAATCAAACACAATGTTTGACCAAAGA 18000 

yvcyipPfEYGFNSSSOLDYDSLNGYTSGOMHTOOOLSPD 636 

CGTATATGAAAATATACCACCCAAGGAAGT6GGTTTTAACTCATCTTC^ ««» 



TTnATACCCAACGACGTTCCCGTTAGATGIAUAtCACGGnAM^ 



661 
1S240 
66TCATTTCCACAACCAGGTTAAAATTGGGOTATTTGC^AGAAM «360

MGDLSCKTKVPGF 13 

AGG AAT T ACTtCTTT AAATTTT AC TT AATGTACCCACAITATCUGTG6TCGT TT6T ATTTA AC6ITTATT ACCG6T ACC ATGGG16ACTT6TC ATGTTGC AC AAAGGTGCC666TT TT A 18480 

TLTGELQYLKOVOOILRYGVRKRDRTGIGTLSLFGNQARY 53 

C6TTAACC0mACTTCAGTACTTAAAACAA6TGGATGATATTTm 18600 

N L R N E F P L L T T I R V F « R A V V E E L L ■ F I R G S T 0 S t E L A A I 0 93 

AnTKGAAATGWTTTCCTCnTTAACTAUAAGCGTGTTTnT6GAGGGCCGTCGTGGAAGAGTTGTTATGGTTTATCC6CG6GTCAACCGATTCCAAAGAACTC6CCGCTAAA 18720 

IHIWOIYGSSKFLNRHGFHKRHTGDLGPIYGFOWRHFGAE 133 

TACACATATGGGATATATACGGATCGAGCAUl^CTAAATAGGAATGGCTTCCATAAAAGACACACGGGGGACCTTGGCCCCATTTACGGCnCCAGTGGAGACAn 18840 

YKOCQSNYLQOGIOQLQTVIOTIKTHPESRRNIISSIHPK 173 

ATAAAGACTGTCAATCAAACTATTTACAGCAAGGAATCGnCAGCT6CAAACTGTW 18960 

OIPLMVLPPCHTLCOFYVANGELSCQYVQRSGOMGLGVPF 213 

ATATCCCCTTAATGGUCTACCTCCATGTCACACGTTATGTCAGTTTTAK^ 19080 

HIAGYALLTYIVAHVTGLKTGDLIHTMGOAHIYLNHIOAL 253 

AUTTGCTGGATATGCACTTCTTACCTACATAGTAttGCATGTTA^^^ 19200. 

KVQLARSPKPFPCLKIIRNVTOINOFKBDOFQLDGYHPHP 293 

A*GTGCAKTAGCTCGATCCCCAAAACCTTnCCTT6CCTTAAAATTATTCGAAATGTAACAGATATAAACGACTTTAAATC^ACGATTTTCAGCTTGATG6ATATAATCCACACCCCC 19320 



PLKMEMAL- 



301 



CCCTAAAAATGCAAATGGCTCTTTAATGGATTTTTAAATGTTGTCAAGAC^^ 19440 

- S C 559 

CAACGGATGCATAGGGTTGCGATAACTGCGATAAGACCCAATGTCCCAAGGATAGATAT^^ 19560 

C R I C L T A I V A I L 6 L T 6 L I S I V G I I Y A V V S F T T A Y T S A D Y T 519 

TAAACGGCCGAAAAC6GAGGGAATTTTTTAGGGTAACCATCTAGAM 19680 

YVASFPPFKIPYGOLHCSYTIPGDLESTPCSSQMNYFRKH 479 

TGATCACACACCCCAGTAATCTCACTGTTTTCGTGGTTAATGGGAGAAK^ 19800 

Q 0 C V G T I E S N E H N I P S 0 N Y « I Y F K V Y G R P Y C I A T C Y I H G 0 439 

ATAATACCTATAGACACGTTGGGAAAT6GATAGACGTCAGGGGTAACGACAGCAGAA 

IIGISYNPFPYVOPTVYASYKMNSYG0RIILYNCQIAPPP 399 

TAATCCATTTGTTTTTGCTGTGGAATTC6TACCGCCGAAACATAACTAAATAATCCATTGGCATATTCTTGTATT6CATCGGTTATAAAATTTTTTCCGATGTTACCAAACCTTGAAGTC 20040 

YDMQrQQPIRYASYYSFLGNAYEQIADTIFHICGlNGFRST 359 

CACCGAACACGTACCGAGTGCGGTGGATAATACTTTGATACGTTACAGTAGGCT^ 20160 

« R V R V S H P P Y Y t S Y N C Y A A Y T Q G T I Y P 0 G V Y T I N P fl Y Y T T 319 

ACTGTAATACTGT6TTCCGATATGACGTTCTTAGTTTTTGTATTAACGACTCGCCAAATATACGTTCCCTCCGTGGTAGCATCCATA6ATAAAATTGTTACAGAAAAATCAGACGTTGTT 20280 

V T I S H E S I V N I T I T N Y V R * I Y T G E T T A 0 M S L I T Y S F 0 S T T 279 

TTAACATCTGGTATTACATAATTTTCCTTAGCGTGTGTAAATATCTCAGWTTGTTTAnAAGTTTAAATCGGCACTGTTGCTATATAACATAACCGGU 20400 

I Y D P.I Y Y N E t A H T F I E P N N I L H L 0 A S N S Y L M V P L 0 P M R I L 239 

6CAnGCCCAGTTGACGGTKGGATCTATAA6GTGACGCGTAAACCAAACTTCAATATGAAGATCGGGGCGTATAAGCGACTTCCACCTTGTTATATTTGAACCnCC6GATCTAAAGAA 20520 

AHGLQRHP0ILHRTFWYEIHLDPRILSKIRTIHS6EP0LS 199 

TATTGTTCATATGTTTTTTGTTGCTGCTTAUGGCCGCCTGTTGTC^ 20640 

YOEVTKQQQKFAAQGGTTLRNYCPMIFTHFYPISQIGGHY 159 

< 

< ft X B 

CATTGTATATTnCATATAGAAAAGGT66TTGT6AATGTT6GGTGTT 20760 

COIMEYLFPPQSHQTHAAPDPKRSAASYPAYAPOPIRTAA 119 

X A X B >< 

GAGGTG6GCGCGACGGCGGGATCGGGCTTTCGGGA AGCGGCCGAGGTGC^GCGACGGCGGGATCGGGC TTTC6GGT AGCGGCCGAGGTGGGCGCGACGGCGGGATCGGGCTTTCGGGA A 20880 

S T P A V A P D P t R S A A S T P A V A P 0 P K R T A A S T P A Y A P 0 P I R S 79 

- reiteration R2 - ■"" 

ft X A X (* X 

GCGGCCGAGGTGGGCGCGACGGCGGGATCG6GCTTTCG6GAAGCGGCCGAGG 21000 

AASTPAVAPOPtCRSAASTPAYAPDPKRSAASTPAYAPDPK 39 



CGGGTAGCGGCCGAGGTATATAATTCAGTTATACTTACG^ 21120 
RTAASTYLETISYPTPQSETSIOICAITLILNIQIRKM 1 




inATTGACACnCCACttTCCCCnAUTAAUGUT ™" 
UCnTATTTnCTCGATUCGmCA^ 2 ™ 

TTAACGCTTAGTCTCATCATCTGAATACA^ ™ 
K V S L R M H Q I C T L 6 A A ¥ T S T L I I I A 6 V S I 6 A F T Y A I 6 I » 

GTTGTATATAATACAAnACCCATAGGTTnTTTTTlCn^ 

T T Y L V I V « L H C K E Q t L V A F 6 Q L 6 F A I I H 6 G I S ¥ A L T V g T L «« 
TTGATAAUTGATTTAATT^^ 

r ! f H H L K 1 I H S I A I M L P T V I Y L V P P I V S H I R I 6 K F V R K L T 



21720 
2S4 



CGGGAACTnCTCGATGGTCACATACTCTCCCfiCGGTCAnnGTGTATATACAACGGCAAAACCTAAATCTGTATAA *«J 
R S S E R H 0 C V R 6 R 0 N Q T Y V V A F G L D T Y T N L Q t H R H I R Y I C T 

21960 
174 



22080 
134 



0 0 L 0 T » » 0 V I S V L T V I S I C I 6 I L Y E Y I S V Y H Y I I » I 

unwnctt^iaTuu^ 

CCC»GTUTTCTJC»HT«^ *S 
UTTTUCBGATGn^^^ ^ 

mraiuTWOKTCTiT^ 

NEDHVtREGNVIH 

GnKim.mUCTGTKHm^ » 
GaUGIGGTi™— "S 



22920 
292 



nUG^TUWnUTHMOC^^ 



L»lYFNIYCTEORTLtLNQV»SBFGS 

CG^TCT<*»G»H«T^^^ 
ATQIYMOTATLIFCTTFGFTKLLRQLRTrUDHRLLiaiur 

GAGTTAGATTTGGAAACATCATTGTATAACAAGCGAGTTCACGTTTTACAACTTGnTGTAACATTGT 
TLHPFMMTYCALERKVVQKYCQV00DP6COGPRQKVMTQI 

TAATACTCCGCTCG^TTGT^^^ 
I S R E P P Q 6 T F I F Y L R T P T S P 0 I T H R F A 0 I L S S R 6 E J A A L Y 

CAACCATTCTCG6CCCAG™ 

VMRPGTKNYQDFMKAPIP IYLQETELSCHIIL6E0SF 
HOA*™^ 

CCAAIGCATCGTCTGTACGCGICCTCUATCCATAATTTACW^ 23880 
LA00TRSRL0M 

GGTGACTTAA6TACATAAACAGGCATGATATnGAATAGTACG6CCCATGG6AGGGAACATnCCACGT6nCCAATACAGGG6^ 2*000 
AGAAGnACCA6AnTGATGTAATGTn6TCATAAAAAATAT6TACATCAnATATAC6TCT6TAAnAACACAAGATCACATC6AAGUnACT6AAGCCGCTC 24120 

H 6 L F 6 L T R F I H E H t L V t P S I I S T P P G ¥ L T P V 31 
GACGATATUACTTKmAGTGTAnGATGGGGCT^ 24240 



23160 
212 



23280 
172 



23400 
132 



23520 
92 



23640 
S2 
A¥DViNYMYTLLERLYPYGIRENlH6PSYTIHCLGVLLRL 71
GGCGGUGACGTATGGAACGTCATGTACAlUnGTTCGAAtt^ 24380 

L T Q R S Y Y P I F V L E R C T D 6 P L S R G * t A I « S R A M H H D E R 6 T S 111 
ATT AAC AC AACG6TC AT ACT ATI 



ATCCGATAffTGTAnGGUCGTTGTACAGACGGCCCAnATCACGTKAGCCAA 2<4«* 



0 L T R V L L S S N T S C S I I Y N t T S E ? Y 0 S V F R H S S T S C I P S E E 151 

GGACTTAACCCGTGTTCTACTATCATCCAACACATCATGTTCTATCAAGTATAACAUACATCGGAAACATATGACAGTGTGnTCGAAACTCTTCCACGAGTT6TATTCCTA^AAGA 24G00 

NKSQDMFlDGCPRQTDCTICLRDQNVCSLTSTIIPSRGHPN 191 

AAACAAATCCCAGGATATGTTTTTGGACGGTTGTCCACGACAAACTGACAAGACGATCTGCCTGCGCGACCAAAACGTATGCAGTCTTACCTCTACAATGCCATCCC6AGGACATCCTAA 24720 

H R L Y H C L C A S L I R « M 6 Y A Y ¥ E A Y 0 I E A 0 E A C A N L F H T R T Y 231 

CCATC6ATTATATCACAAAffGTGTGCAAGTCTTAnAGAT66ATG6G^ 24840 

271 

GGCTTTGGTTTATACGACAGATACTGATTTACTCn^ 24960 

311 

TACATACCCTGAATTTTTGGTTWCTTTGTTCGCTGTCAGAC^ 25080 

0 T S T R S P T Y D S « R H G E V F I S L T V A T S G I T E N 6 V S Y S I Y A S 351 

GGACACTTCAACGCGCTCCCCCACTTACGA^^ 25200 

H R S E V T Y 0 A S I A L H L I P P 5 S S P L 0 N L E R A F Y E H I I A Y V T P 



ALVVTTDTDLLFKGCOILLOAIPHFAPVYRCROLLQYLGI 
&TTTGGTTTATACGACAGATACTGATTTACTCTTCATGGGCTGTGATAW 

TYPEFLYAFVRCQTOLHTSDNLKSVQQVI00TGLIVPHQII 



391 



TAACCGATCGGAlKTGACAGTAGACaCAGn^ 25320 

L T R G R L K L M I R Y N I M Q N T A 0 P Y M V I H T L V H H L t 6 E I H A R Q 431 

ATTGACCCGCGGTCGttTAAAGTTAATGUACGTGTAAATATTATGCAAAATACG^ 25440 - 

YARIFKQF1PTPLPLNTYLTKY1IH-' 455 

ATACGCACGTATTnTAAACAGTTTATTCCTACTCCACTCCCACTAAACACTGTATTAACAAAATATTGGAAnAAAACACACATAAGAGCGACTTAATGGnCATTGTTTTATTTTGCT 25560 



25680 
272 



CGTATATACATGTTATAAATCGTTTATCACTGTGCCCGCATAAGATGTACTGTGTCTCTC^ 

-LDNIYT6AYSTSHREFFNTHKDAIMFALPF0S0PP 

TGGGGTGTTAAATAGTTTTDGTACATTAATCGCTGATAAAAGCCTGTCC^ 25800 

P T N F L It P V H I A S L t R 0 A S F It V Y Q T I A 0 Y N Y L R T IE P A H S I I 232 

AAACGCACACTCGATTTCAACGGCTTCCCAA^^ ^ 

F A C E I E V A E S F L 0 H I R T I A P I E P V Y N N Y I C C S A S T H I A E 0 192 

TCGGCTTATAAGGTCGmAATTGACAAGmC^ 

R S I L 0 N F Q C T Y Y F L G H H R L Y A I A A F S S Y F F I 6 E I L I « L I Y 152 

TnnCTGCAACGGATGGGnGTCCCGTACCnTTCTTCCAACCAnGTACTTTTTGTTGGATCGACGGAnATTAATAGTGACATTTACGTATTGTA 26160 

I E k Y S P M D R Y It E E L • Q Y K Q 0 I S P H H I T Y H Y Y Q Y R I S E 0 6 R 112 

GAACAACATTAGnGAAmGACTATAGACACGCGCGTGGACAACCTCGATGCACTCTTGTTCAATGTAGTAATGGTGAATATCCTT 26280 

F L M L 0 I 0 S Y Y R A H V Y E I C E Q E I Y Y H H I 0 K Q S F L Q T L S G L H 72 

AACATTTACCAGATCATCTGCCGCCGATUAAATGTAAAAATAAATCTGTAGAATAn 26400 

VHYLOOAASLFTFIFRYFILEOETLCOLYQYDOEIIFOSE 32 

TAACCAACGATTC6AAATGCTCAGGGCACGTAUTT6TnATATCT6G^ 26520 

L « R H S I S L A R L H N I 0 P C E P R Y F F H S C 0 K 0 0 M _ 1- 

TAAAGCACAACTGGTACAGGTTAAnCGCCTCCCGCAAACAGTCCGCTGTTCGTAGCTTTACGAATTTTACAGTAGTACA 26640 

L A C S T C T L E G G A F I G S N T A It R I I C Y Y N G T t L G A I Y A R I L I 736 

ATTCATTATTTTGGAGGC6GGAAnGTCCCGTCT6GGCGTTCCTM^ *£ 

N M I t S A P I T G 0 P R E E I F L T H S Q S Q 0 I F P A R E A C M 0 V L 0 E 0 696 

CTCATATTCAAACKTGTTnATATTTTAA6A6T6G6TGA "J" 

EYEFATKYKLLPHSNSLCGFSRIYSBQHItELIHIYELRit 656 

TTCACATGAATACATATCtCTTAGTTCGTCCATAAGGTCTAAGTTGGGTCTAAGTAACTCACCCGAGGTGGTGACCTTACTAAACATATTATTATAAA 27000 

ECSYMDRLE0MLDLNPRLLEGSTTVKSFMHHYIPSF6ESC 616 

CTCCGTTACCT6TGCAGATGAAACTGTGG6CAnAACGCTAAGAACTGCGAGTTGTATAACCCATAAK 2 ™° 

E T Y Q A S S Y T P H L A I F Q S N Y L G Y A C I 0 0 R L T C V P L 0 L Y S I T 576 

AGAAAACCCATCTTGGTGTAACCATCCCTTAGCATATTTACTT^ 27240 

S F G 0 0 H L I G It A Y t S E T F G t F P A L G G I K C M E M S T t M T E Y I M 536 



BELRLtTSTFAEGHLTRLIO*Ri^o^»- , ■ ,, 

11™ "S 

LYF0EVrL*SQL6LTAt-*P'« u * Lrrct * 

TGCnGGCGGttTCG™^^^^ ^ 
AQRVRPTLCLlPSTLWTEHSVSHLHlKtni 

„™TuccGccnuo™^ ^ 

DORIViKL«S»ITSIVENLE»PTT5trrii 
NIIOHVTNFIRtFEM 

TCCH— ^ 

RPSTPDNSLQVLPLARTNtPQHTH»»wci 

LLFGI6R6FC0ENL0SLALHLRVoERURnUt 
TUGTCCGTCC«GGm« 

TGfltCJCGGC6GnjT*IIGCn*UtTiTCtT6tJTGGT4TnW "£ 
UHttTTjmGCCGCGTTM^ T 
T „„G«GTTJ«m«TCT^ 
T«g™TTC^ 



30000 
159 

30120 
119 
TAGACGC($TTTCTGTACGTTTT(£TGGTACATGTATCG6TTGTro 30360

SATETRKPPVHIPQKNLGGFRP6ITLDRSMNGFF6PAYTH 39 

TATGAAGCCCGTTCCCTAnTGAATAAAACGGnATnCCTAAAAGACTGAW 30480 

HLCH6IQIFRNHGLLSIHTCCLTQEHLTFHSHTPQSGM 1 

AG66GTTAAATTCACACAATGCAATCGTGACGTGGTACTATCTGAAATGTTGCCTGGK^ 30600 

AAAAATAnTTGATTGGCGTTAAAAGGTTCTTCAACTTACCOTGACGTCCTTW 30720 

21 MEEPICYOTQKLLDOLSNLKVQEADNE 27 

ATATCACGTTATAAAGTTAAGTCAGCGTAGAATATACCATGGAAGAACCAATTTGm 30840 

RPISPEKTEI ! ARVK¥VrFLRSTOriPArHFIQIIEPLHSM 67 

GACCATGGTCACCAGAGAAAACAGAAATCGCCAGAGTTAAGGTAGnAAGTTnTACGATCTACCCAGAAAAnCCAGCTAAACATTTTATTCAGATAT6GGAACCCCTSCATTCTAA 30960 

ICFVYSNTFLAEAAFTAEHLPGLLFIRLDIOITIEEPGNS 107 

TCTGTTTTGTATATTCCAATACATTTTT6GCW^ 31080 

LKILTQLSSVVODSETLHRLSANKLRTSSKFGPVSIHFII 147 

TAAAAATmAACCCAGCTATCAAGTGTAGTACAAGATTCCGAGACGTTACATCGTTTATCGGCCAATAAATTACGAACCTCGTCTAAATTTGGACCCGTTTCGATA^ 31200 

TDVINHYEVALKOATTAIESPFTHARIGHLESAIAAITQH 187 

CGGACTGGATAAATATGTACGAGGTCGCCTTAAAGGATGCAACAACAGCCATTGAATCACCATTCAtt^ 31320 

KFAIIYOMPFVQEGIRVITQYAGNILPFNVMNNQIQNSSL 227 

UTTTGCGATCATTTACGATATGCCATTTGTTCAAGAGGGGATTCGT6TTTTAACACAATATGCAGGATGGCnCTTCCGTTTAATGTTATGTGGAATCAGATK 31440 

TPLTRALFIICMIOEYLTETPVHSISELFADTVMLIKOE'a 267 

CTCCTCTAACACGAGCCCTTTTTATAATCTGTATGATTGATGAATATCTCACGGAAACGCCAGTACATAGCATATCAGAATTAnTGCAGATACTGTAAATTTAATTAAAGATGAGGCG 31560 

FVSIEEAVTNPRTVHESRISSALAYROPYVFETSPGMLAR 307 

TCGTATCCATCGAAGAAGCGGTAAC6AATCCAC6AACGGTGCACGAGTCACGAA 31680 

RLRLONGIWESNLLSLSTPGIHIEALLHLLNSOPEAETTS 347 

GACTTAGAnAGACAATGGTATATGGGAAAGCAACCTCTTATCGTTGm 31800 

GSNVAEHTRGIIEIYOASTSPSMLISTLAESGFTRFSCKL 387 

GAAGTAATGTAGCAGAACACACCC6TGGCATTTGGGAAAAGGTTCAGGCTAGTACATCGCCTAGTAW 31920 

LRRFIAHHTLAGFIHGSVVADEHITDFQQTLGCLALY6GL 427 

TACGTCGGTTTAnGCTCACCACACACTCGCCGGTTTTATTCACGGAAGCGTT6TAK^ 32040 

AYQLVETYAPTTEVVLTYTRTVNETEKRYETLLPAL6LPP 467 

CATACCAATTAGTGGAAACGTACGCTCCTACTACCGAGTATGT6nAACAT^^ 32160 

GGL6Q JMRRCFAP'RPLIES I QA TRV I LL NEISHAE ARETT 507 

GAGGCCTGGGACAAATTATGCGGCGCTGTTTTGCTCCACGACCCCTTATTC^ 32280 

YFKQTHNQSSGALLPQAGQSAVREAVLTIFOLRMOSRMGI 547 

ATmAAGCAAACACATAATCAATCCTCAGGTGCGTTATTACCACAAGCAGGACAAA6TGCCGTACGCGAAGttGTACTAACCTGGniGACCTACGT 32400 

TPPV0VGMTPPICV0PPAT6LEAVMITEALKIAYPTEYNR 587 

CTCCCCCGGT(KiATGTGGGTATGACACCTCCTATTTGTGTT6ATCCACCGGCTACA6GGTTGGAAGCTGTCATGATAACAGAA 32520 

SSVFYEPSFVPYI IATSTLDALSATI ALSFDTRGIQQALS 627 

CTAGCGTGTTTGTGGAACCGTCGTffGTGCCnATATTATTGCAACAAGCACGC^^ 32640 

ILO>AROV6S6TVPNA06YRTKLSALITILEPFTRTHPPV 667 

TTCTTCAGTGGGCTCGCGATTAT(»ATCCGGAACCGTGCCCAATGCAGATGGATATCGCACAAAACTATCTGCTCn 32760 

LLPSHVSTIOSLICELHRTVGIAVDLLPQHVRPLYPORPS 707 

TTTT ACC ATCTC ACGTTTCT ACT ATAG ATTCCCTT A T ATGCGA AC TTC ATCGGACTGTTGGCATTGCCGTTGACCTGCTTCCCC AGC ACGTCCGTCCTT TGGTTCCTGACCGTCCTTCT A 32880 

ITNSYFLATIYYOELYGRITRLDKTSQALVENFTSNALVV 747 

nACAAATAGCGTTTTTTTAGCAACTCTCTATTATGATGAACTTTACGGTCGU 33000 

SRYHLMLQKFFACRFYPTPOLQAVGICMPrVEROEQFGVW 787 

CTCGGTACATGTTAATGTTACAAAAATTTffTGCGTGTCGTTTTTATCCAAC^ 33120 

RLNDLADAVGHIYGTI06IRTQMRYGISSLRTIMA0ASSA 827 

GTTTAAACGATCTTGCTGATGCGGTTGGTCATATTGTTGGGACAATACAAGGAATCCGAACGCAAATGAGAGTGGGAATATCCAK^ 33240 

LRECEHLMTKTSTSAIGPLFSTMASRYARFTOOQHDILMR 867 

TT AGGGA ATGTGA A A ATT T AATG ACTA A A ACC TCCACTTCTGC T ATTGGGCCTCTTTTTTC AACG ATGGCTTCCCGGT ATGC ACGGTT T AC ACAGG ATC A A ATG6ACA TTTT AA TGCGTG 33360 






' VDKLTT6ENIPGLANVEIFLNRKERIATACRHATAVPSAE 907 

' TTGACAUCTAACAAUKAGmATAHCCCGW^ 33480 

SIATVCNELRRGLrNIQEORVNAPTSYMSHARNLEOHKAA 947 

CTATTGCAACCGTGT6TAATGAATTGAMCGCGGmAAAAAATATACAAGA6GATCGTGTAIATGCCCCAACCTCATATATGAGTCAC^^ 33600 

VSFVKDSRQQFIVDS6PQHGAVLTSQCNI6TVEHVNATFL 987 

TTTCATTCGTTATGCACTCCAGGaACAGTTTATTGT^ 33720 

HONVKITTTVRDVISEAPTLIIGQRILRPDEILSNVOLRL 1027 

AT6ATAATGTTAAAATAAC6ACAACGGTCAGAGACCTAATTTCAGAGCCTCCGACGCTBATAATAGGACAAAGAT6GCnCGTCCA6ATGAGAnTTATCTAATGTAGATTTGC6TCTTG 33840 

GVP6HTS6SDP- * 1038 

GCGTACCC6GGAATACAAGT66GAGTGACCCnAATATAAAACA6GCGTGTnATGTACATTAAAGU 33960 

GTAnnCATAACCTCCTAGGTTnTGGAGCTACACGTGCTTATTCAACGCTCTT^^ 34080 

M0I1PPIAVTVAGVGSRNQFD6ALGPASGLSCLRTSLSF 39 

AAATGGATATAAnCCGCCTATAGCTGTCACTGTTGCGGGAGTGGGAAGCCGTAATCAATnGAC6GTGCCCTGGGACC6GCGTCAGGTCTGTCATGTTTAAG 34200 

LKKTYAH6INATLSS0NI06CLQE6AAWTT0LSNMGRGVP 79 

TGCATAT6ACATATGCGCATGGAATTAATGCAACCCTGTCATCAGACATGATT6A 34320 

0MCALV0LPNRISYIICLGDTTSTCCVLSRIY6DSHFFTVP 119 

ATATGTGTKTCTTGTTGATCTCCCCAATCGAATnCATATATTAAAtf 34440 

DEGFMCTQIPARAFF00VNM6REESYTIITV0STGHAIYR 159 

ACGAGGGTTTTATGTGCACACAAATTCCCGCTAGA6CGTTTTTCGATGATGTGTGGATGGGACGTGAAGAGTCGTATACAATTATAACTGTAGACTCAACGGGAATGGCC 34560 

QGttlSFIFOPKGHGTIGQAVVVRVNTTDVYSYIASEYTHfl 199 

AGGGAAACATATCTTnATnTT6ATCCACATGGCCATGGGACTATAGGACAGGCTGTAGTT6nC6GGTGAATACCACGGATGTGTACTCTTATATCGCATCG^ 34680 

P 0 H V E S Q I A A A L V F F V T A N D G P V S E E A L S S A V T L I Y G S C D 239 

CC6ATAACGTAGAATCCCAATGGGCC6CTGCATTAGTTTn^ 34800 

TVFTDEQY. CEKLVTAQHPLLLSPPNSTTIVLNKSSIVPLH 279 

CATATnTACAGATGAACAATAnGCGAAAAACTGGnACAGtTCAA 34920 

QNVGESV5LEATLHSTLTNTVAL0PRCSYSEV0PVHAVLE 319 

AAAACGTTGGTGAAAGTGTATCCTTGGAAGCAACCCTACATTCAACGTTAACCAACACGGTTGCACTGGACCCTAGATGTAGTTACAGCGAG6TTGATCCTTGGCATGCGGTTCTAGAAA 35040 

TTST6S6YLDCRRRRRPStfTPPS$EENLACI006LVNNTH 359 

CUCCTCGACT6GGTCTGGC6Tm6GATTGTCGTeGTA6ACGCCGTtt^ 35160 



STDNLHKPAKKVLKFKPTVDVPDKTQVAHVLPRLREVANT 399 

CCACGGATAArnACATAAACCCGCTAAAAAGGTTCTCAAAnTAAACCAACTGTAGACGTGCCGGATAAAACACUGTGGCACAT6TATTAC 35280 

PDVVLNV$NVDTPES$PTFSRNMNV6SSLKDRKPFLFEQ$ 439 

CAGAC6TTGTGTTAAATGTATCCAATGTAGATACGCCT6AATCCAGTCCCACTTTTTCAC6GAACATGAATGTAGGAAGCAGTTTGAAAGATCGGAAGCCATTTCTATTTGAACAGA6TG 35400 

GOVNMVVECLLQHGKEISNGVVQNAVGTLDTVITGHTNVP 479 

6TGATGTCAACATG£nGTCGAAAAACTACTACAACATGGGCATGAAAnAGCAATGGATACGTACAAAATGCGGTGGGTACGnGGATACTGTTATT 35520 

INVTRPLYMPDEKDPLELFINLTILRLTGFVVENGTRTHH 519 

HTGGGTAACAAGGCCCTTGGnATGCCAGACGAAAAGGATCCAnGGAK^ 35640 

6ATSVVSDFI6PLGEILTGFPSAAELIRVTSLILTNMPGA 559 

GTGCTACAA6CGTTGTATCAGACTTTAT*GGTCCCCTTG6GGAAAnm 35760 

\ 

EYAICTVLRKKCTIGHLIIAKFGLVAMRVQDTTGALHAEL 599 

AATATG^TATTAAAACTGTTCTCCGGAAAAAATGTACAATTGGCATGCTCATTATCGCTAAGTTTGGTCTAGTTGCCATGCGGGTTCAGGATACAACCGGCGCTTTACATGCCGAACTAG 35880 

OVLEADLGGSSPIOLYSRLSTGLISILNSPIISHPGLFAE 639 

ATGTGTTAGAAGC&GATCTAGGAGGTTCGTCGCCCATAGACCTCTATTCTAGACTGTCGACAGGTCTTATAAGTATACTAAATTCGCCTATTATTTCTCATCCCGGACT^ 36000 

LIPTRTGSLSERIRLLCELVSARETRYMREHTALVSSVKA 679 

TTATTCCAACCCGTACAGGGTCCCTGTCTGAACGAATACGTCTTCTnGTGAATTAGTCTC6<H;CCGGGAGACACGCTATAT6CGTGAACACACCGCGCTTGTnCTAGTGTAAAGGCn 36120 



LENALRSTRNKIOAIQIPEVPQEPPEETDIPPEELIRRVY 719 

TAGAGAATGCAnAC6GTCTACCCGCAATAAAAnGATGCCAnCAAATACCAGAAGTTCCCC^ 36240 

EIR$EVTMLLTSAVTEYFTR6VLV$TRALIAEQSPRRFRY 759 

AGATAC6ATCCGAAGTTAUATKTAnGACCTCG&TGTTACAGAATACTTCACCCG€GGAGTGW 36360 
ATASTAPIQRLLDSIPEFOAKLTAIISSISIHPPPETIQH 799

CGACCGCAAGTACGGCACCCATTCAACGGCtnTAGATTCTCTTCCGGAATTCGACGCH^ 36480 

LPVVSLLKEIIKEGEDINTDTALVSVLSVVGEAQTAGYLS 839 

TCCCCGTCGTATCTCTGnAAAAGAGCTTATTAAAGAAGGGGAAGATTTAAACACAGA 36600 

RREFOELSfiTIKTINTRATQRASAEAELSCFMTLSAAVOQ 879 

GACGAGAGTTCGATGAATTATCACGTACAAnUAACCATTAATACACGCGCAACGCAACGGGC^ 36720 

AVKDYETYNNGEVICYPEITROOLLATlVfiATOOLVRQIKI 919 

.CCGTAAAGGACTATGAUCATATAACAATGGTG^GGTCAAGTATCCTGUAT^ 36840 

LSDPMIQSGLQPSIKRflLETRLKEVQTYANEARTTQOTIIC 959 

TAAGTGATCCMTGATCCAATCCGGTTTACAACCnCGAnAAAAGAC 36960 

SRKQAAYNKLGGLLRPVTGFVGLRAAVDLLPEIASELDVQ 999 

GTCGAAAACAGGCGGCATATAATAAACTCGGGGGGnACTTCGCCCGGTAACC^ 37080 

GALVNIRTKVLEAPYEIRS0LT60F1ALFN0YR0ILEHPG 1039 

GAGCCCTGGTAAATCTCAGGACCAAAGTCTTAGAGGCGCCGGTAGAGATCCGTTU^ 37200 

NARTSVLGGLGACFTAIIEIVPIPTEYRPSLLAFFGOVAD 1079 

ACGCACGCACATCTGTCTTAGGAGGACTGGGAGCTTGTTTTACAGCTATTATCGAAATTGTGCCGATACCTACG6At3TATAGACCATCATIGCTTGCGTTTTTTGCTGACGTGGCADATG 37320 

VLASOIATVSTNPESESAINAVVATLSKATIVSSTVPALS 1119 

TGCTTGCATCCGACATCGCGACCGTATCTACTAACCCGGUAGTGAGTCCGCCATAAACGCTGTTGTTGCAACTCTTAGTAAAGCGACGTTAGTTTCATCTACAG 37440 

FVLSLYKKYQAIOQEITNTHKLTELOKOLGDOFSTLAYSS 1159 

TTGTGTTGTCGTTATATAAAAAATATCAGGCTTTACAACAAGAMTTAC 37560 

GHLKFISSSNVDOYEINOAILSIQTNVHALHDTVKLVEVE 1199 

GACACTTGAAGTTTATATCATCTTCAAATGTAGATGATTATGAAATAAACGATGCGATATTATCAATACAAACAAATGTGCACGCCCTAATGGATACG6TTAAACTTGTTGAAGTTGAAC 37680 

LQKLPPHCIAGTSTLSRVVKDLHKLVTMAHEKCEQAKVLI 1239 

TGCAAAAGCTACCCCCCCATTGTATTGCTGGGACATCTAOT 37800 

TOCERAHKQQTTRVLYERKTROIIACLEAMETRHIFNGTE 1279 

CCGATTGTGAACGTGCACATAAACAACAAACGACTCGGGTTTTGTATGAGCGTTGGACACGTC^ 37920 

LARLRDMAAAGGFDIKAVYPQARQVVAACETTAVTALDTV 1319 

TGGCACGGTTGCGAGATATGGCCGCTGCGGGAGGGTTTGATATA^^ 38040 

FRHNPYTPENTNIPPPLALLRGLT1F00FSITAPVFTVMF 1359 

TTCGCCACAATCCATATACCCCCGAAAATACAAATATTCCACCACCTTTGGCTTT^^ 38160 

PGVSIEGLLLLMRIRAVVLLSADTSINGIPNYR0MILRT5 1399 

CAGGTGTTAGIATTGAGGGACTCCTTCTGCTTATGCGTATTCGCGCG6TTGTGTTATTATCC6CCGATACGTCTATTAAT6GAATACCTAACTACCGAGATATGATATTACGAACCTCGG 38280 

GOLLQIPALAGYVDFYTRSYDQF I TESVTLSELRADIR.QA 1439 

GGGATCTATTACAAATACCCGCATTGGCTGGGTATGTTGATTTTTACACACGGTCTTATGATCAGTTTATAACCGAAAGTGTAACGn 38400 

AGAKLTEANKALEEVTHYRAHETAKLALKEGVF1TLPSEG 1479 

CCGGGGCTAAACTTACAGAAKAAATAAGGCTTTKU^^ 38520 

LLIRAIEYFTTFDHKRFIGTAYERVLQTMVDRDLCEANAE 1519 

TATTGATTCGGGCTATAGAGTATTTTACAACnTCGATCATAAACGAn^ 38640 

LAQFRMVCQATKNRAIQILQN1VDTANATEQQE0VDFTNL 1559 

TTGCACAGTTTCGTATGGTGT6TCAGGCAACAAAGAACC6TGCAATACAAATTTTACAAA 38760 

KTLLKLTPPPKTIALAIDRSTSVQDIVTQFALLIGRLEEE 1599 

AGACGTTATTAAAACTAACCCCCCCTCCCAAAACAATTGCATTGGCCATTGATAG 38880 



TGTLDIQAVDHMYQARNI IDSHPLSVRIOGTGPLKTYKDR 1639 

CTGGTACGTTGGACATTCAGGCGGTTGACTGGATGTACCAAGCTCGCAATATTATTGACTCCCATCCACTAAGTGTGCGTATAGAC6GTACCGGCCCCCTG^ 39000 

VDtLYALRTICLDLLRRRIETGEVTlDDAITTFKRETGDML 1679 

TGGATAAACTTTATGCGTTACGAACTAAATTAGATCTCCTACGACGAC6AATAG 39120 

ASGOTYATSVDSIKALOASASVVOMLCSEPEFFLLPVETK 1719 

CATCGGGGGACACGTACGCTACTTCCGTAGATAGTATAAAGGCACTCCAGGCATCGGCGTCTGTGGTTGACATGCTTTGTTCCGAACCCGAA 39240 

HRLQKKOQERKTALOVVLOICQRQFEETASRLRALIERIPT 1759 

ACCGTCTCCAAAAAAAGCAACAGGAACGTAAAACCCCGTTGGATGT^ 39360 
ESDHDVLRMLLRDFDQFTHLPIVIKTQVKTFRNLLMVRL6 1799

AGAGTGACCATGACGTTCTTCGTATGTTATTACGTGATTTCGA^ 39480 

IVASYAEIFPPASPNGVFAPIPAMSGVCLEOOSRCIRARV 1839 

TGTATGCAAGnATGCTGAGATTTTTCCACCCGCGTCTCCAAACGGAGTATTT^^ 39600 

AAFNGEASVVQTFREARSSI0ALFGKNLTFY10T06VPLR 1879 

eCGCGTnATW6GGAGGCGTCTGTGGTGCAUCGTTTAG6GAA 39720 

VfiVCYKSVGVCLGTMLCSQGGLSLRPALPDEGIVEETTLS 1919 

ATAGAGTGTGTTATAAATCAGTTGGGGnAAACTTGGAACCATGCTATGO^ 39840 

ALRVANEVNELRIEYESAIKSGFSAFSTFVRHR'HAENGKT 1959 

CATTACGCGTGGCCAATGAGGTCAATGAGCTACGtAnGAATACGAATCCOT 39980 

NARRAIAEIVA6L1TTTLTRQYGVHKDKLIYSFEICHHLTS 1999 

ACGCACGCAGAGCCATTGCAGAGATATACGCCGGCCTTATAACAACAACAW^ 40080 

VMGNGLTKPIQRR6DVRVIELTLS0IVTILVATTPVHLLN 2039 

TAATftmATGGACTAACTAAACCAATCCAGAGAAGGGGTGATGTACttGTAm 40200 

FARLDLIKQHEYyARTLRPVIEAAFRGRLLVRSLDGDPKG 2079 

TTGCTAGATTGGATTTAAnAAACAGCATGAGTATATGGCCCGTACCCK^ 40320 

NARAFFNAAPSCHKLPLALGSNQOPTGGRIFAFRMADNKL 2119 

ATGCCCGGGCCTTTTTTAATGCCGCCCCATCCAAACATAAACTCCCGTTAGCTCTTGGATCAAACCAAGATCCTACCGGC6G6AGAATATTTGCATTTCG6ATGGCAGATTGGAAACTTG 40440 

V I HP OKITDPFAPKQLSPPPGVIEANVDAYTRIMATDRLAT 2159 

TTAAAATGCCACAGAAAATAACGGATCCTTTTGCGCCATGGCAACTnCCCCtf^ 40560 

ITVLGRMCLPPISLVSMlfNTLQPEEFAYRTQDOVDIIVOA 2199 

TTACTGTACTTGGGCGCATGTGTCTCCCGCCAATTm 40680 

RLDLSSTLNARFOTAPSNTTLENNTDRKVITDAYIQTGAT 2239 

GICTGGATTTGTCATCCACGCTTAATGCAAGATTTGUACCGCTCCCAGCAATACCACGTTAGAGTWAATACAGACCGTAAAGTAATTACAGATGCTTATATTCAAACCG^GCAACGA 40800 

TVFTVTGAAPTHVSNYT. AFDIATTAILFGAPLVIAMELTS 2279 

CAGTTTTTACAGTAACGGGGGCGGCACCAACTCACGTnCTAATGTAACAGCGTTTGACATAGCAACTACGGCTAmTAn^ 40920 

VFSQNSGLTLGLKLFDSRHUATDSGISSAVSPOIVSNGLR 2319 

TTTTTTCACAAAATTCCGGACTTACnTGWKiTTAAAATTATTCGATTCCCGGCATATGGCTACAGATTCGGGTATATCCTCAKCGTATCTCCCGATATTGn^ 41040 

LLHMDPHPIENACLIVQIEKLSALIANICPLTNRPPCLLLL 2359 

TACTGCATATGGATCCTCACCCAATTGAAAATGCATGnTAATTGTCCAACTAGAAAAACTGTCCGCGCTCATT^AAACAAACCTCnACAAACAATCCCCCGTGTTTACTGCTATTGG 41160 

OEHMNPSYVLVERKDSIPAPOYVVFVGPESLIDLPYIDSO 2399 

ACGAACATATGAATCCCTCTTATGTTTTATG<jCAACGAAAAGACTCGATTCCAGCTCCG0ATTATGTG^TCTTTTG4jGGGCCAGAATCTCTTATTGATTTGCCGTACATCGACTCCGATG 41280 

EDSF. PSCPODPFYSQIIAGYAPQGPPNLOTTOFYPTEPLF 2439 

AGGACTCTTTCCCCTCGTGTCCCGATGATCCATTnACTCGCAAATTATTGCCGGTTATGCGCCCCAAGGCCCC^ 41400 

KSPVQVVRSSKCKKMPVRPAQPAQPAQPAQPAQTVQPAQP 2479 

AGTCTCCCGTTCAAGTTGTTAGAAGnCCAAATGTAAAAAAATGCCCGTCCGGCCCGCGCAGCCCGCGCAGCCCGCGCAGCCCGCGCAGCCCG^ 41520 

< ft X A X A X A X A K 6 X A XI > 

< - reiteration R3 >

IEPGTQIVVQNFKKPQSVKTTLSQKDIPLYVETESETAVL 2519 

TAGAACCGGaACACAAATAGTGGTACAAAATTTTAAGAAACCCCAAAGCGTAAAAACAACCCTTAGCCAAAAAGATATTCCCTTGTATGTGGAAACCGAATC 41640 

IPKQLTTSIKTTVCKSITPPNNQLSDiKNNPQQNQTLNQA 2559 

TACCTAAGCAATTAACCACCTCCATTAAAACAACCGTTTGTAAAAGTAff^ 41760 

FSKPILEITSIPTODSISYRTNIEKSNQTQICRKQNOPRMY 2599 

TCAGTAAACCAATACTTGAGATTACCTCCATTCCGACAGATGACTCGATATCnACCGGACTTGGATTGAAAAATCAAATGAlACACAAAAACGGCATCAAAATGACCCTCGAATGTATA 41880 

NSKTVFHPVNNQLPSNVDTAADAPQTDLLTNYKTRQPSPN 2639 

ACTCCAAAACAGTATTCCACCCTGTAAATAACCAATTACCTTCTTG^TTGACACGGCAGCCGATGCCCCCCAAACGGACCTATTGACAAACTATAAAACAAGACAGCCGTCGCCAAACT 42000 

FPROVHTKGVSSNPFNSPNROLYQSOFSEPSOGYSSESEN 2679 

TTCCGCCGGACGTACACACATGGGGGGTATCTTCTAACCCGnTAACTCACCGAACAGAGACCTATATCAAAGTGATTTTAGTGAACCTTCTGACGGCTA 42120 

SIVLSL0EHRSCRVPRHVRVVNADVVTGRRYVR6TALGAL 2719 

CT ATCGTACT A AGTC TCG ACG A AC ATCGGTC A TGTCGtGTTCCT AGGt ACGTACGiGTTGff ^ 42240 

ALLSQACRRMIDNVRVTRKLLMDHTEOIFQGLGYVrLLLD 2759 

CACTGTTAAGCCA6GCATGTC66CGTAT6ATCGACAACGTTAGATAT^^ 42360 
6 T Y I

23 



•uemr.ni ""^mwoimuuw^ ™ 

' ■OHilSIKilfllSl 220 

QUAAPPS6TTPQVTSSQAGQ 180 

emTaoc T e,TCOTcr«^ 
cc« T oj T «r,c™^ 

CC«T,TCCTm™ 

libLASlNIGTSLTSSLIVNHS 115 

™«™{™ 43300 

vvruf^vbTNUIKirPDCOVOAIfll 75 

C «^«K~»TTUG«CCm«T^ ^ 

- * 0 I 154 

—GO 

GGAA 44400 

i"LV*uiTPlIAEIEPPILNEF»VS 74 
285
45380 

325 



R I S F Y F N 0 0 6 L T 6 6 P Q P L A A A L A N I t 0 C * R M ¥ 0 C S S S E ti R 
CGTUGTCATTnACTTTAATCAGGACG6ACTGACTGGA6GCCCTCAACCTTTAGCG$CCGCCnG^ 

T S 6 M I T C A E R A L I I 0 I E F E D I L I 0 K L t t S S Y V E A A « G V A 0 325 
WAA0TGG6AT6ATTACCT6CGC66AACGT6CAnAAAAGA06ATATAGA6TTTDAAGATATATTAATA6ACAAACnAAAAAATC6TCnACGTAGAAGCAK 45480 

tALLLLSGVATfHVOERTNCAIETRVGCtrSYIQANRIEM 
TTGGCffTATTAnACTGAGTGGGGTT(£TACnG6AAT6W^ 



385 
45600 



SRDYPKQFSKFTSEOACPEVAFGPILLTTLKNAKCRGRTN 
TCCAGGGACGnCCUAACAATTnCCAAATTTAWMGCGAGGATGCCTC^ 

TECMLCCLLTIGHYMIALRQFtRDILAYSANNTSLFDCIE 
ACCGAATGCATGnATGTTGnTATTAACCATAGG6CACTATTGGATCGCTTT6C6GCAGTnAAAAG6GATATATTAGCATACTCA6CAAATAACAC 

A V Y I H L 



405 
45720 



445 
45840 



485 



PVIHAlSLDMPirLltFPFN0EGRFlTIVrAA6SE 
KTGTAATCAATKAT6GAttCTAGATAACCCCATTAAACnAAAnTCCATnAAT6ATGA 45960 



L A A Q 



FCOLLCALSELQTHPIILFAHPTTAOKEVLELYKAQ 
TTTTttGATCTCCTATGCGCTCTCTCGGAAmCAGACAAACCCTAAAATTTTATTTGCCCATCCTACAACC6C6GATAAGGAAGTGTTM 



525 
46080 



MHLKPTRFFHANQPPMPHSYEMEDL 25 

N R F E G R V C A G I N T L A Y A F I A Y Q I F P R I P T A N A A F I R D G G L 565 

AACAGATTTGAAGGTCGTGTATGTGCTGGCCTGTGGACATTGGCGW 46200 

CFDOMQYRISPSNTPYRSMSRRYKSVSRSGPSMRVRSRTP 65 

IILRRHAISLVSLEHTLSKYV- 585 

ATGCTTC6ACGACATGCAATATCGCT6GTCTCCCTCGAACACACCCW 46320 



CRRQTIR6KLMSKERSYYRHYFNYIARSPPEELATVR6LI 
ATGCC6CCGTCAAACCATTC6AGGAAAACTTATGTCAAAGGAGCGGTCTGTC^ 

YP1ICTTPYTLPFNLGQTVADHCLSLSGM6YHLGLG6YCP 
CGTGCCAATTATTAAGACGACCCCTGTCACCCTTCCGTTTAACTTGGGTCAG 

T C T A S G E P R L C R T 0 R A A L I L A Y V Q 0 L H N I Y E Y R V F L A S I L 
GACATGCACTGCATCTGGAGAACC6CGTCTATGTCGAACC6ATCGGGCGGCTCTGATACTAGCATATGTTCAGCAGCTTAACAACATATACGAATATC6T6TGTTTCTTGCATCCATTTT 



105 
46440 



145 

46560 



185 
46680 



A L S D R A N M Q A A S A E P L L S S V L A Q P E L F F M Y H I M R E G 6 H R 0 225 
GGCGCTATCAUCCGA6CCAACATGCAAGCAGC6TCC6CTGAACCCCTAn^ 46800 



IRYLFYRDG0A6GFMMYVIFPGICSVHLHYRLI0HIQAACR 
TATACGCGTACTTTTTTATCGTGAT6GAGAT6CC6GAGGGTTTAT6ATGTATGTTATATTTCCGGGGAAATCTGTTCACCTCCATTACAGACTAATCGATCATATACAGGCCGCGT6TCG 

6YJ1VAHYKQTTFLLSVCRNPEQ0TETVVPSIGTS0YYCK 
GGGGTATAAAAT*6TCGCACACGTnGGCAGACAACATTnTACTGTCGGTATGTC6CAACCCAGAACAACAAACAGAGACTGT6GTGCCATCCAnG^ 



265 



305 
47040 



MCDLNFOGELLLEYKRL YA'LFOOFVPPR- 333 

A ATGTGTGACC TTAAC TTTGATGGAGA ATTGCTHTGG A AT AC A AA AG ACTC T ACtt^ 47160 

- S Q H L I A K P I C F V R R M I Q H S Q E E E T 1 E A E T N M I N D N 1160 

CACGGGKGTGTATACAAACAAAGCCTGCCGCCTGCAAGCGGTn^ 47280 

CPAHICVFGAAQLRNLMKVNVVRTEPIFRKLLRETLKTOH 1120 

CCAAATAACKCTTAAAAGTTACACTCGCC6TCCCAATGAGATGAGAAAAA 47400 

6FIACFTYSAT6ILHSFYYDINL$LGHSTVHIPDEAL0$I 1080 

AlTAACTTACGCTTTGCnCCCCACACCGTTTACCTGCGGTATTCTGTAAAGGATCTCCACGTAGCAAAGCTACACTTTnGCATCA^ 47520 

I L I R I A E G C R K G A T N Q L P D 6 R L L A Y S t A 0 A E V E 0 T P A Y I V 1040 

TAAGWATGCGTTCTCGAACGTn6GGATTTGACCCTGTC 47640 

YPIRERYRPIQGQRMVLKYYVTLHALRRHIYASPPRSLEA 1000 

GTC AT A AC A A ACH AnUT ATCC A ATTTGGGTG ATGT A ATCTGGCG ATGTGC ATCTGC A A TT ATGC6TCC A AACCCG6CC ATCCC AG AC6GC ATGGCCCGTCT A 47760 

TMYFKNIOLKPSTIQRHADAIIRGFGAMGSPMARRNIEAI 960 

6IAACACACGACGCCTCCGCCGCAGCACGCGAGACGGTGTCGTCATATAACAACAGTTCTACAAGTTTGCGGGCATAATC6TTAATAAATTGACAGTTGTTTTTTCTAACCAAGTCGACT 47880 

SYCSAEAAARSVTOOYLLLEVLKRAYONIFQCNNKRVLDV 920 

CCCnCATTAUKCmCCGCCGTAAAnACCCCAATGTACmnCTTTGTTATAAGCAAAAGTnTATAAAAGnmTCACACTCCAACmATAGGAGGACAAAACA 48000 

G r M L Y I G 6 Y I V G I Y % I I T I L L L It I F T t E C E L I I P P C F L A T 880 

6WAnATAT6TGCCATnTCTC6Ct6ArrnA6CTATCCCtTCUCACTAACACCCTTGAATCG6ATAAACACAGAATCCGTATCTCCATATATAACCT^ 48120 

SIIHAHKE6IKAIGEYSV6KFRIFYS0T06YIYKVEYAKQ 840 



368 
51960 




• ei,c«FUGGrHVLPSSAAPNLTRACNAARERF6FSRCQGP 128 

51240 

(Jt6TT6JLC0GT6CTDTT6^ "™ 

, ■ w F t M ¥ 6 G L D I Y H I N H 6 D Y I R I P I F P Y Q L F M P 0 V N R L V P 208 

, tt u^ 51480 

n p F N T N H R S 1 G E 6 F V V P T P F Y H T G I C H L I H D C V I A P U A V A 248 

GWHTmlcACTCUCACAGGTm 51600 

l«¥RN¥TAVARGAAHLAFDENHEGAYLPPDITYTVFQSSS 288 

TTGCGCGTC46AAATGTAACTGCCGTC6CCCGA6GAGCGGCCCACCTTK 51720 

tfiTITiRGARRMOVNSTSrPSPSGGFERRLASIMAADTAL 328 

AGTGGA ACC KT ACCGCCCGT6G AGC6CGTC6 A * AC6 ATGTC A ACTCC ACGTCTA AGCC T AGCCC A TCGGG6GGGTTTG A A A6AC6GTTGGCGTCT AT T ATGGCCGCTGACAC AGCCTTG 51840 

HAEVIFHTGIVEETPTOIKEiPMFIGMEGTLPRLMALGSY 
CACGCAGUGTTATATTCAACACTWAATTTACGAAGAAACTCCAACAGATATCAAAGAATGGCCAATCTTTATAGGC 

t i » v i G Y I 6 A H V F S P K S A L V L T E V E D S G M T E A t 0 G G P G P S 408 

ACCGCKGTGTGGttGGWTCATTGGTGCGATGG 52080 

eMfif¥OFlGPHLAAHP0T0RDGHVLSSQSTGSSNTEFSVO 448 

TITUTC™ 52200 

* i i i I r G F G A P L L A R L L F Y L E R C D A G A F T G G H G 0 A L t Y V T 488 
flniGttACTCATTTGT^ ^ 

r t f n <: F I P C S L C E K H T R P V C A H T T ¥ H R L R 0 R M P R F G 0 A T R 528 

6GGACCTTTGKTCTGAAATTCCATGTAGTTTATGTGAAAAACACAC 52440 

APtGVFGTMNSOYSOCDPLGNYAPYLILfllCPGOQTEAAKA 568 

chcctattgLtgtttggaacaatgaacagccaatatagcgac 52560 

f y 0 fl T V R A T L E R L F I 0 L E Q E R L L 0 R G A P C S S E G L S S V I V 0 608 

ACCATGCAGGACACTTATAGGGCTACACTA6AAC6CTTGTTTATC6ATCTAGAACAAGAK 52680 

u P T F ft ft I L D T L R A R I E Q T T T Q F M I V L ¥ E T R D Y t I R E G L S E 648 

CATCCAACGTTTCGTCKATATU^ 52800 

1 T H S y A L T F 0 P Y S G A F C P I T N F L ¥ K R T H L A ¥ ¥ Q D L A L S Q C 688 

* 1 " * TTCTCCCATTACCAATTTTTTAGTTAAACGAACACACCTAGCCGTGGTACAAGACTTAGCATTAAGCCAATGT 52920 



GCC ACCC ATTC * ATGGCGTT A ACGTTTG ATCCAT ACTC AGGAGC ATT 

h f « F ¥ G 0 Q V E G R N F R N Q F Q P ¥ L R R R F ¥ 0 L F N G G F I S T R S 1 728 

CATTGTGHTTTTACGGACAGCAAGTTGAGGGGCGGAACTTTCGTAACCAATT 53040 

T ¥ T L S E G P Y S A P N P T L G Q D A P A G R T F 0 G D L A R Y S Y E Y I R D 768 

ACCGTAACATTATCTGAAGGTCCTGTATttGCCCCAAATCCGACATTGGGACAAGACK 53160 

t fi ¥ t N R ¥ ¥ F S G M C T N L S E A A R A R L ¥ G L A S A Y 0 R Q E t R ¥ D H 808 

ITiCGAGTTAAAAATAGGGTCGTTTTTTCAGGTAACTGTACAAATCTCTCTGAGGCAGCCC^ 53280 

l H G i L G F L L t Q F H 6 L L F P R G M P P N S K S P K P Q « F M T L L 0 R N 848 

ttacacggWcctagggtttttgot 53400 

fl « P A 0 r L T H E E I T T I A A Y r R F T E E Y A A 1 N F I H L P P T C I G E 888 

tAGATGCCGGCAGATAAACTTACACACGAAGAGATTACCACTATTGCAGCTGTTAAACGGTTTACCGAGGAATATGCAGCAATAAACTn 53520 

L A 0 F Y M A H L I L t Y C 0 H S Q Y L I N T L T S I I T G A R R P R 0 P S S ¥ 928 

nAttCCAGnnATATGKAAATCTTATTCTTAAATACTGCGATCATTC^ 53640 

L H l I R t 0 ¥ T S A A 0 I E T O.A I A L L E t T E N L P E L V T T A F T S T H 968 
TT6CATTGGATICGTAAAGATGTCACGTCCGCCGCG6ACATAGAAACCCAAKAAAGKGCTTCTTGAAAAAACGGAAAA 

L ¥ B A A M N Q R P M V ¥ L 6 I S I S K Y H G A A 6 N H R ¥ F Q A G H « S C L H 1008 

TTAGTCCGCGCGGCCATGAATCAACGTCCCATGGTCGTnTAGGAATAAGCATTAGTAAATATCACGGAGCGGCAGGAAACAACCGCGTCTTTC "880 

RFI1ACPRGCFICPYTGPSSGNRET 1048 



CnntPlFIFOMR 



54000 



GGGGGTAAAAATGTUKCCGCTATnACATTTGATCGCACTCW 
TLS0O¥RGIIYSGGAMVQLAIYAT¥¥RA¥GARAQHMAF0O 1088 
ACCCTATCCGICCAAGTTCGCG6TITAATTGTCAGT6GCGG6GCCATGGnCAAnAGCCATATACGC^ 54120 
■ L S L T D D E F L A R 0 I E E L H 0 Q I I 0 T L E T P R T V E G A L E A V I I 1128

TGGTTAAGTCTTACAGACCATGAGTTTTTAGCCAGAGACnGGA^ 54240 

LDEKTTAGOGETPTNLAFKFDSCEPSHDTTSNYLNISGSN 1188 

CTAGATGUAAAACGACAGCGGGAGATGGGGAAACCCCCACAAACCTAGCATn^ 54360 

ISGSTVPGl«RPPEODELFDLSGIPI«HGHITIIEMI- 1204 

ATTTCAGGGTCAACTGTCCCTGGTCTTAAACGACCCCCCGAAGATGACGAACTCTTTC^ S4480 

* 

TTATCCAATTAAAGCCCACACGCG6GTGAGTGTKGTAATMACAAGTC 54600 

MELOINRTLLVLLGQVYTVIFQV 23 

GTCTCCUCCATTCAGCTTACAGTCCA>TGGA^ 54720 

E L L R R C 0 P R V A C R F L Y R L A A N C I T V R Y L L t L F L R G F N T Q L 63 

A ACTGC T ACGTCGAT6TG ATCCA AGGGTGGCGT6TCGCn THAT ATCW^ 54840 

t F G H T P T V C A L H « A L C Y V R 6 E 6 E R L F E L L Q H F I T R F V Y G E 103 

AATTTGGAUCACTCCCACGGTTT6TGCACTGCATTGGGCATTATGTTATGT 54960 

TKDSNCIKDYFYSAFNLICTCOYHHELSLTTVGGYISSEIQ 143 

CTAAAGACTCAAACTGTATCAAAGAnACTn6TCTCAGCGnTAACnAUA 55080 

FLHOIENFLKQLNYCYIITSSREALHTLETVTRFKTOTIG 183 

TTTTACACGACATTGAGAATTTnTAAAACAGCTTAATTACTGCTATATTATCACGTCTTCTCGTGIGGCGCTAAACACATTGGAAACCGTGACGCGGTTTATGACAG 55200 

SGLIPPVELFDPAHPCAICFEELCITANQGETLHRRLLGC 223 

GCGGTCTAATACCACCCGTGGAGnGTnGATCCGGCGaTCCATGTGCTATATGTTTTGAA6AATTATGTATAICAGCTAACCAAGGTGAG^ 55320 

ICDHVTKQVRVNYOYODIIRClPYIPDVPDirROSAVEAL 263 

TCTKGATCACGTTACTAAGCAAGTTCGGGTTAACGTGGATGTTGACGAU^ 55440 

R T L Q T IC T V V N P M G A t H D T F D 0 T Y E I A S T M L 0 S Y N V F I P A P 303 

GAACACTTCAAACCAAGACGGTAGTCAATCCCATGGGAGCAAAGAACGATACGTTTGACCAAACATACGAAATTGCGAGCACCATGCTT6ATTCTTATAATGTTTTTAAACCTGCCCCTC 55560 

RCMVAISELKFWLTSNSTEGPQRTLDVFVDNLOYLNEHEK 343 

GGTGTATGTACKCATCAKGAGCTTAAATTCTGGTTAAC6TCTAATO 55680 

HAELTAVTVELALFGKTPIHF0RAFSEEL6SLDAIDSILV 383 

ACGC AG A AC TT AC AGCCGT AACGGTTGAGTTGGCGTT AnTGGAA AA ACTCCC AT AC ACT TTG AT AGGGCGT TTTC TG A AG AACTCGGATCTCTG6AT6C AATT6AT A 55800 

GMRSSSPDSQIEALIKACYAHHLSSPLMRHISHPSHOHEA 423 

GCAATCGCTCATCCTCACCAGACAGTCAGATAGAAGCATTAATTAAAGCCTGTTATGCCCATCATCTATCGTCGCCTCTCATGCGTCACATTTCTAACCCGAGTCATGATAACGAAGCCG 55920 

ALRQLLERVGCEODLTKEASDSATASECOLMOOSSITFAY 463 

CCTTACGCCAACnTTAGAUGAGnGGGTGTGAGGATGATTTAACCAAAGAGGCGAGTGACAGCGCTACAGCATCCGAATGTGATCTGAACGATGATAGTAGCAT 56040 

HGIERLLSKAIIDAAERIRVVLEHLSKRSLTSLGRCIREO 503 

ATGGATGGCAUACCTGTTATaAUGCAAAUTTGACGCTGCGGAAAGAAAACGAGTATATCTTGAACATCTGTCTAAGCGCTCTCTUCCAGCCTCW 56160 

R Q E L E t T L R V N V Y G E A L L Q T F V S M 0 N G F G A R N V F L A It V S Q 543 

WCAAGAGCTA6AAAAAACACTCAGG6TMACGTTTATGGAGAGGCCn^ 56280 

A S C I I D N R I Q E A A F D A H R F I R H T L V R H T V 0 A A M L P A L T H I 583 

CAGGGTGTATTATCGACAATCGCATKAGGAAGCGGCCTTTGAT6CACATAGATTTATAAGGAATACCTTAGTTCGACATACAGTAGATGCGGCTATGTTACCTGCACTTACA 56400 

FFEtVNGPLFNHDEHRFAQPPHTALFFTVEHVGLFPHLKE 623 

TTTTTGAGTTGGTC A ACGGCCC ATT6TTTAATCACGATGA ACACCGTTTTGC AC A ACCCCC T AACACCGCCTT ATT TTTT ACCGTGGAAAACGT TGGCCTATTTCCGC ACTTAA AAGAGG 56520 

ELAKFNGGVV6SNILLSPFRGFYCFSGYE6VTFA0RtAiK 663 

AATTGGCAAAGTTTATGGGCMTGTCGTTGGnCCAACT6GCTTCTCAGTCCATTTAGGGGCITTTATTGCTTTTCTG6G0TAGAAG^ 56640 

YIREtVFATTLFTSVFHCGEVRLCRYDRLGrOPRGCTSQP 703 

ATATTAGGGAGCTTGTGTnGCAACCACACTATTCACCTCTGm 56760 

KGIGSSHGPLDGIYLTYEETCPLVAI I QS6ETGI00HTYV 743 

AAGGTATAGGCAGnCCCACGGACCCTTAGACGGCAnTATTTAAC6TACGUGAAACAT6TC^ 56880 

IYDSDVFSLLYTLM0RLAPOST0PAFS- 770 

TCTACGATTCAGACGTTTTTTCTCTTCTATACACCCTAATGCAGCGGCTGGCTCCGGATTCAACGGACCCGGCGTTTTCATAACCTCCGTTACGGGGGTGTGGT 57000 

MFVTAVVSVSPSSFYESLOVEPTQSEOITRSAHLGOGD 38 

ATTTTCTATGTTTGTTACGGCGGTTGTGTCGGTCTCTCCAAGCTCGTTTTATGAGAGTTTACAAGTAGAGCCCACACAATCAGAAGATATAACCCGGTCT6CTCATCT6GGCGATGGTGA 57120 




f I R E A I H K S Q 0 A E T r P T F Y Y C P P P T 6 S T I ¥ R I E P T R T C P D 78 

TGUITUMG^ 57240 

vu!GriiFTE6IAY¥VKENIAAYKFKATYVYKDYlYSTAMA 118 

TTATCAttTTG^ 57380 

r. < s V T 0 I T N R Y A 0 R V P I P V S E I T D T I D t F G I C S S t A T Y ¥ R 158 

CGGAIGnCTTATACGCAUTTACTA^ 57480 

It H H r Y E 4 F N E D t H P Q 0 H P L I A S t Y N S Y 6 S I A I H T T N D T Y M 198 

IHTUCC^ 57600 

V A 6 T P G T Y R T 6 T S V N C I I E E V E A R S I F P Y 0 S F 6 L S T 6 0 I I 238 
(JGTTKCGGAACCCCCGGAACATITAGGACGGGCACGTCK^ 57720 

Y M S P F F G L R D G A Y R E H S H Y A M D R F H Q F E 6 Y R 0 R 0 L 0 T R A L 278 
UAeATGTCCCCGTTTmGGCCTA^ 57840 

L E P A A R N F L Y T P H L T V 6 » H ■ r P K R T E Y C S L V K ■ R E V E D Y Y 318 

iCTGGUCCTGCAGCGCGGAACTTTTTAGTCACGCCTCATnAACGGTTGGn 57960 

R D E Y A H N F R F T M I T L S T T F I S E T N E F N L N Q I H L S Q C V It E E 358 

TCaGHGAGTATGCAC 58080 



. d . t i h R I Y T T R Y N S S H V R T G 0 I Q T Y L A R G 6 F V V V F Q P L L 398 

AGCceG^TmmAAcc^ 58200 



SIISLARLYLQELVRENTNHSPQKHPTRNTRSRRSVPVEIR 438 

gaLattccc™ 58320 

i U R T I T T T S S ¥ E F A H L 0 F T Y 0 H I Q E H Y N E M I A R I S S S M C Q 478 

TGCCAATA6AACAATAACAACCACCTCATCGGTGGA 58440 

L 0 H R E R I L « S 6 L F P I H P S A L A S T I I D Q R Y t A R I I G 0 ¥ I S V 518 

aTACAAAATCttGAACK^ 58560 

S « C P E L 6 S 0 T R I I L Q N S « R ¥ S G S T T R C Y S R P L I S I ¥ S L H 6 558 

TTCTAAnGTCCAGAACTGGGATCAGATACACKATTATACTTC^ 58680 

S6TVEGQLGT0HELIMSRDLLEPCYAMHKRYFLF6HHYYV 598 

GTCCGGGACGGTGGAGGGttAGCnGGAACAGATAACGAGTTAATTATGTCCAGAGATCTGTTAGAACCATKGTGa 58800 

YEDYRYYREIAYHDYGMISTYYDLMLTLLKDREFMPLQYY 638 

TTlTGAGGATTATCCnACGTCCGTGAAATC^ 58920 

T R D E L R D T G L L D Y S E I 0 R R N fl M H S L R F Y D I 0 I Y ¥ 0 Y D S G T 678 

TAClAGAGACGAGCTGCG6GATACAGGAnACTAGACTACAGT6AAATTCAACGCCGAAATCAAATGCAnC6CTGC6n 59040 

i I H 0 C H A 0 F F 0 6 L G T A G Q. A ¥ G H ¥ ¥ I 6 A T 6 A L L S T ¥ H 6 F T T 718 

GGCCATTATGCAGW^^ 59160 

F L S N P F 6 A L A Y 6 L L V L A 6 L Y A A F F A Y R Y ¥ L t L I T S P H t A L 758 

6TTTTTATCTAACCCATnGG6GCAT 59280 

y P I T T K G L K Q L P E G M 0 P F A E K P N A T 0 T P I E E I 6 0 S 0 N T E P 798 

4TATCCACTCACAACCAA6GGGnAAAACAGTTACCGGAAGGAATGGATCCCTTTGCCGAGAAACCCAACGCTACTGATACCCCAATAGAAG 59400 

S V N S 6 F 0 P D I F R E A Q E U I It Y U T L ¥ S A A E R Q E S t A R I I N t T 838 

GTCGGTAAATAGCG6GTnGATCCCGATAAAnTCGAGAAKttAGGAAATC^ 59520 

SALLTSRLTGLALRNRRGYSRYRTEKYT6V-. 868 

TAGC6CCCTnTAACTTCACGTCTTACCGGCCTTGCTTTACGAAATCGCCGAGGATACTCCCGTGTTCGCACCGAGAAT6TAACGGG6GTGTAAATA6CCA6GG^ 59640 

TTAATAAAAATGTGTAmCGTTACTCATGTGTCT^ 59760 

yESSNINALQOPSSIAHHPSIQCASSLNETVKDSPPAI 38 

TT ATT ATGG AATCGTCT A AC ATT A ACGC6CT AC A AC A ACC6TCGTCT ATCGC AC ATC ATCCGTCC AAAC AGTGCGCTTC AAGTCTC AATG A AAC ACT A AA AGATTCTCCttCCGCGATTT 59880 

¥ E 0 R L E H T P Y Q L P R 0 6 T P R D Y C S Y 6 Q L T C R A C A T K P F R L M 78 

AT6U6ATAGGTTAGAACACAC6CCGGTACAATTA(XCCGC6ACGGTACACCCCWGAC6TATGn^ 60000 

R D S 0 Y 0 Y L N T C P 6 6 R H I S L A L E I I T 6 R « ¥ C I P R Y F P 0 T P E 118 

GCGACAGCCAATACGACTACTTAAACACATGTCCAGGGGGttGTCATATTC 60120 
EK»MAPVIIPOREOPSS0DEOSDTO-

mmGGiTcceeccmTAnmcacAC^ ^ 

^ G™TTTAACACGACACCGCm^ ^ 

CRG. VHMQNAFMO 594 
CCDICOrCTKCCICMCCCCTCTjm^ ^ 

«s'simiiininii(£ninHiiM(ifMnin $14 

WCTCCIGT^ 

' t I II I I I II t I I I t 1 1 f I S) I H I 1 I II i f S H I I I H I I II 474 

6ciic»cccM»iumji»niMw«uw 

'WMHSPFiHIIliEf ilf iUilMliUOIUHI 434 

JClIC6Gmi6ICI«U«ITTttK«^ 60960 

' u ' ' 0 *■ f " » C P I 0 I D I G R E 4 » L * N M F 4 S II 0 II E I R N P P P Y 6 394 
K »UCCC6mGCICCCCCGUC^ 

rr»Ti"BBFH6PPC6y6GTNPPF»PPHPS6»PG»Ir06NIIP 354 

arccmiGn»ccT6ca*mnAGici^ 

«biiIBPII*IQGTl.l.0Sr»*TPILVFEGGTPV6SPPQPSI 314 

6,3» 

TCGr.IJTG^CCGCC^TGGUIGIG^m^^ 61440 
I*MCGTGACCCU»»TICC6KITC«^ 

i»N6»^6*ERRRI4VV0l(RORLlMNIIV»TSlLVt»L»OES 194 

{""""J™""?'^^ 616M 
»«l»i:ilL6»VDCVlSl.LR»ltSEI(SlVRFPEIS$EPTC0rH 154 

r»I6VRR6VVCll»HTFFNG0PIENPSLRISSlS»SPirH 114 

G1G«CC»GT«TIKGC*CGTICI»GCGG«»T«r«» $m) 

''Lit»iiELPSiyS0RHGFFHSH«*EFL»lHL0PCRVI6L 74 
««M»GGICtGCGIJIGTCCIC«JT6»I^ 

G R I 0 E I I 4 I V E G » V C 0 t R H 0 I N I P I I S T P P I 1 $ R » l E 34 

GGGGTA»T»mmiCCCCITCGTCTO^ 62160 
inLtbfcDKSTLALYGAVYLAECNEEDAEAAM t 

TAGTGAATACTCACGCTGTG6AGGCAAACT6ACGAATGACACCCAAACAGACAAAATATW 

-PTSAFQPIVGLCVFVLLOYDSNVHF. QALRRSEELAA 544 
CCCTTACAC<^ 

R V R P 0 H M L « G L G R E Y 0 L M V S D H I S ¥ V A A I P L S T V N 0 R R H E 504 

CTAATTCCAATGCAATTAAGCnATGAGCCGATCTTGGTACTGTCCAGAAGAAATATCTATTACG^ ^ 

LELAILSILRD0Y0GSSIOIYTRGLS6RSLQTHA0LCLAV 464 
CTAATCCAGGmCUTTCAGTCAGCTC 

LGPFMETLETTRNLTYLPAIYRECLFQFNNMGSKFKG0F6 424 

CCGTCGCT^ 

lAwnLIVNNTPTGTAESVILGLQFNNKLICDGYVHGNIILLR 384 

^ T A AT A ATAA TAACGCCGTA A TCCTGTC AA TTG TT A ACCTT A A T AG AGT? TGCTC TTCC ATAAGA AAC ACGTTTTGGGCCCGT TC T A A ATttCGCCGCGGCCGCCTGTTGUTCT TGT 628S0 

RLLLLATITDITLRLLTQDEMLFVNQAREIYAAAAQQIKD 344 

CCACATATKGGTA^ ^ 

»T*THNROIIOHYGPNFQGPAPPLYQLRILADMVSHSTLA 304 
CaCHGATCATAC^ 

6 0 0 Y R G G R A H F 0 0 R P L R D Y C F 0 V S N t N E H T T N Y N Y F T E T Y 264 




VZV DNA sequence 






ATTCCGGTAGGTGCTCTATAAGGTTCCCCAAGGACGAAACTTGAGC^ 63240 

EFLHEILN6FSSVQPEHVSNSTRYMLYLCLVACEFASYAR 224 

TATCTCCAACATACAnCTCCCGGCGGAaGTAGACATGTTACCG^^^ 63360 

DDYYMB6AS0LCTVTTNMFTRSIR6DKCDV6RAAVPRDLV 184 

CJAGCCGTTCCTGUGAGTACGATACCATGGCCCGAAAACAATCCCTGGAGAGnAU 63480 

IREQLTRYiPGFVIGPSNNGRARGLYVLTIFDVQFNTDYQ 144 

KllCG6TGCATCAnTTT6GCAATCTGTACCTCGGGGTGT 63600 

LPAONtAIQYEPNI SENRI IETRTCEESQADAEEAEAAAT 104 

TCTCCAGGGAATCCAAAACCTTGGCCATGC6C6TTAGnGTTCTTCGAGGGGCTnAAACGACGATCTATTTCC6nGGTAACGTAATCGnTCCCCGCGAAGGT^ 63720 

EISOLVIAMRTIQEELPKLRROIETPLTITEGRLNOLA AV 64 

CWCCCCCGCATTTTTTAACGTTAACGTATTTTTTTCCAAATC6GGATTCATACGCCCTCTTAACTCAAACGCGCGAGCCGTCCAGTA6T6TATGG6GAAGTTGGGGGCTATAAAGTTCT 63840 

AAANKLTLTNKELOPNMRGRLEFAPATITYKIPFNPAIFNK 24 

TASTGGTAGACAAIUTATCCCACAnTATTCGGAAACGAGATAGATCCGAACCCATATCTC^ 63960 

TTSLFIGCXNPFSISGFGYRATH 1 

T ? T A A AAGCTGTT TTCTT ACCC ATGGG A A A AC ATCCCGGTT ATACTTTGT A AA ATTCC ACC ACA AGCACCTA A AGAAGGCCHCTAAGGGGTAA ATCC ACCCC AC A AGCTGC ATTTTCT^ 64080 

-GHSFMGTISQLIGGCAGLSPRRLPLOVCCAANEE 225 

€ AAAC TTTGTTA A AGCGGA ACGATGGC A TGATTTCGC ACGC TTTTTCGC A AG AG A AC AT ACGTGAATnTCTTTTTGC ATAGACGTC TTCGCTCTCTA ACGG ACCTT ATCGGGGGGGTAT 64200 

FITLASRHCSKARKKAISCVHIKKKCLRRRERVSRIPPTY 185 

ATTCCGCTACATTCTCCAAATGCGACGCTAGCATAACAAGGTTTCCATGAATCACCT^ 64320 

EAVHELHSALHVLNGHIVKPPLRTVQLLNLGRQSVFVLLP 145 

>>> 5' end of dPyK mRNA 

GGCTCACCATTATTTCATCAGATCCCGTGWTGTGGTTTCCTTjmAAAGCCATGGTATCCCTCAGCT^ 64440 

TVMIEDSGTPTTEKILAMTDRLQRMGECFQHYKTPTHINA 105 

CGCUAAACGGCAIGATTTTAATTCCAC TATAAAA CAAACGGTCTnCCGGCACCACTGGATTCCGTTT GTATAATA CAAACACAATCG6G6CGTCGGC6TCCC 64560 

SFRCSKLEYIFCVTKGAGSSETQIICVCDPRRRGLNVEFS 65 

ACiTTGiTIT«GTAClGCCCTTTGAACATCC4CGTGGGITAACGGC6ACAGGAGTTTTGCCAGCCTCGGGTTGAACGCGTCCCCGAAACCTC6ACGTACGTTATCAATATCCTTnTGA 64680 

KSIRVARQVOVHSLPSLLICALRPNFAOAFGRRVNDIOKKL 25 

ACCI 

6TKATCGTAAAAACGAGT6TGGCAACGTTGTCCCAAACGAAAACACTTGGCCCGAATTCGACTAGCGGACATATTTGAAGTTCCGTCCCAGAAGATAACCTAAGACGCGTTTGTCTACA 64800 

VDYFRTHCRQGLRFCKARIRSASM 1 

ttSTOKTOVKMGVLRIYLOGAYGIGKTTAAEEFLHHFAI 38 

ATlAACAtGTCAACGGATAAAACCGATGTAAAAATGGGCGTTTTGCGTATTTATTTGGACGGGGCGTATGGAAnGGAAAAACAACCGCCGCCGAAGAATTnTACACCAC 64920 

TPNRILLIGEPLSYNRNLAGEDAICGIYGTQTRRLNGDVS 78 

AClCCAAACCGGATCTTACTCATTGGGGAGCCCCTGTCGTATTGGCGTAACCTTGCAGG6GAGGACGCCATnGCGGAATTTACGGAACACAAACTCGCCGTCTTAA 65040 

PE0AQRLTAHFQSLFCSPHAIMHAK1SALM0TST5DLVQV 118 

CCTGAAGACGCACAACGCCTCACGGCTCATTTTCAGAGCCTGTTCTGTTCTCCGCATGCAATTATGCATGCGAAAATCTCGGCATTGATGGACACAAGTACATCGGATCTCGTACAACTA 65160 

NKEPYKIHLSDRHPIASTICFPLSRYLVGDMSPAALPGLL 158 

AATAAWAGCCGTATAAAATTATGTTATCCGACCGACACCCAATCGCCTUACTAm 65280 

FTIPAEPPGTNLVVCTVSLPSHLSRYSKRARPGETVNLPF 198 

TTTACGCTTCCCGCTGAACCCCCCGGGACCAACTTGGTAGTTTGTACCGTTTCACTCCCCAGTCATTTATCCAGAGTAAGCAAACMGCCAGACCGGGACAAACKTTA 65400 
Xul 

VKVLRNYYIMLINTIIFLKTNNNHAGNNTLSFCNDVFKQK 238 

GTTAfGGTTCTGAGAAATGTATATATAATGCTTATTAATACAATTATATTTCTTAAAACTAACAACTGGCACGCGGGCTGGAACACACTGTCATTTTGTAATGATGTATTTAAACAGAAA 65520 



t 0 I S E C I I L R E V P G I E 0 T L F A V L t L P E L C G E F G N I L P L « A 278 

HACIAAAATCCGAGTGTATAAAACTACGCGAAGTACCTGGGATTGAAGACACGTTATTCGCCGTGCTTAAACTTCCGGAGCTTTGCGGAGA6TTTGGAAATATTCTGCCGTTATG&GCA 65640 

IGMETLSNCSRSHSPFVLSLEQTPQHAAQELCTLLPQMTP 318 

TGGGG1ATGGAGACCCTTTCAAACTGCTCACGAA6CATGTCTCCGTTCGTATTATCGTTAGAACAGACACCCCAGCATGCG6CACAAGAACTAAAAACTCTGCTACCCCAGATGACCCCG 65760 

AKHSSGAVNILKELVNAVQONTS- * 341 

6CAAACATGTCCTCCGGTGCATG6AATATATT6AAAGAGCTTGTTAATGCC6TTCAGGACAACACTTCCTAAATATACCTAGTATTTACGTATGTACC AGTAAA AAGATGATACACATT6 65880 



>>> 3' end of dPyK mRNA 

TCA?ACTCGC6TGTACGTGTTTTTCTTTTTTATATATGCGTCATTTATTACCACATCCTTTAATCCCGCCTTTATCTCCCTAAAAC6GAGTGGTAATATTAAAA6CCGCCAAGCCT6nG 66000 
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MFALVLAVVILPLITT 16 

GTGGGTGAGGAGCGGTAAAimCGCTCTGTGCATAACGTTGC^^ 66120 

ANKSYVTPTPATRSIGHMSALLREYSDRNMSLKLEIFYPT 56 

GGCTAjlTMTCTTAC6TAACACCAACCCCTGCGACTCGCTCTATC^ 66240 

6F0EELirSLHIGH0RKHVFLVIVtYHPTTHEG0V6LYIF 96 

TGGTTTC6ATGAA6AACTCATTAAATCACTTCACTGG6GAAATGATAGAAUCACGTO^ 66360 

PKYLL$PYH l ( FCAEHRAPFPA6RFGFlSHPVTP0Y$FF0SS 138 

1CCAAAATACTTGTTATCGCCATACCATTTCAAAGCAGAACATCGAGCACCGTTTCCTGCTG<iACGTTTTGGATTTCITAGTCACCCTGTGACACCCGACGTGAGCTTCTTTCACAGTTC 66480 

FAPYLTTQHLVAFTTFPPNPIVVHLERAETAATAERPF6V 176 

GTTTGCGCCGTATTTAACTACGCAACATCTTGTTGCGTTTACTACGW 66600 

SLLPARPTVPKNTILEHKAKFAT1DALARHTFFSAEAIIT 216 

AA6 TC TT TTACCCGC TCGCCCA AC AGTCCCCAAGA AT ACT ATTCTGGAAC A 66720 

NSTLRIKVPLFGSVNPIRYVATGSVLLTSDSGRYEVNIGV 256 

CAACTCAACGnGAGAATACACGTTCCCCnTTTGGGTCGGTATGGCCAA^^ 66840 

GFHSSLISLSSGPPIELIVVPHTVKLNAVTSDTTVFQLNP 296 

AGGATTTATGAGCTCGCTCATTTCmATCCTCTGGACCACC^^ 66960 

PGP0PGPSYRVYLLGR6LDHNFSKHATVDICAYPEESL0Y 336 

ACCGGGTCCGGATCCGGGGCCATCTTATCGAGTTTAn 67080 

RYHLSMAHTEALRKTTKADQHOINEESYYHIAARIATSIF 376 

CCGCTATCATTTATCCATGGCCCACACGGAGGCTCTGCGGATGACAACGAA 67200 

ALSENGRTTEYFLLDEIVDVQYOLKFLNYILMRIGAGAHP 416 

TGC6TT6TCGGAAATGGGCCGTACCACAGAATATTTTCTGTTAGAT6AGATK^ 67320 

NTISGTSDLIFADPSGLHDELSLLFGQVKPAMVDYFISYD 456 

CAACACTATATCCGGAACCTCGGATCTGATCTTTGCCGATCCATCGCAGCTTCATGACGAACTnCACTTCnTTTGGTCAGGTAAAACCCGCAAATGTCGA 67440 

EARDQLKTAYALSRGQOHVNALSLARRVIHSIYKGLLVKQ 496 

TGAAGCCCGTGATCAACTAAAGACCGCATACGCGCTTTCCCGTGGTCAAGACCATG^^ 67560 

NLHATERQAIFFASMILLNFREGIEMSSRYIDGRTTLLLM 536 

AAATTTAAATrcTACAGAGAGGCAGGCTTTATTnnHXTUATGAn 67680 

TSMCTAAHATOAALNIQEGIAYLNPSKHMFTIPNVYSPCM 576 

GACATCCATGTGTACGGtAGCTCACttCACGCAAKA^ 67800 

CSLRTDLTEEIHVMNLLSAIPTRPGLNEVLHTQLOESEIF 616 

GGCTTCCCTTCGTACAGACCTCACGGAAGAGATTCATGnATGAATCTCCTGTCGGCAATACCAACACGCCCAGGACTTAACGAGGTATTGCATACCCAACTAGACGUTCTGAAATA 67920 

OAAFKTMMIFTTNTAKDLHILHTHVPEYFTCQDAAARNGE 656 

CGACGCGGCATTTUUCCATGATGATmTACCACATGGACTGCCAAAGATTTGCATATACTCCACACCCATGTACCAGAAGTATTTACGTGTCAAGATGCAGCCa^ 68040 

YVLILPAVQGHSYVITRNKPQRGLVYSLADVOVYNPISVV 696 

ATATGTGCTCATTCTTCCAGCTGTCCAG^ACACAGTTATGTGATTACACGAAACAAACCTCAAAGGGGTTTGGTATATTCCCTGGCAGATGTGGATGTATATAACCCCATATCCGTTGT 68160 

YLSROTCYSEHGYIETVALPHPONLKECLYCGSVFLRYLT 736 

nATTTAAGCAGGGATACTTGC6TGTCTGAACATGGTGTCATAGAGAC6^ 68280 

TGAIMOIIIIDSKOTERQLAAMGHSTIPPFHPOMHGDOSr 776 

CACGGGGGCGATTATGGATATAATTATTATTGACAGCAAAGATACAGAACGACAACTAGXCGCTATGGGAAACTCCACAATTCCACCCTTCAATCCAGACATGCACtGGGATGACTCTAA 68400 

AYLLFPHGTVVTLLGFERRQAIRNSGOYLGASLGGAFLAY 816 

GGCTGTGTTGTT6TTTCCAAACGGAACTGTGGTAAC6CTTCTAGGATTCGAACGACGACAAGC^^ 68520 

VGFGIIGVNLCGNSRLREYNKIPLT* • 841 

AGTGGGGTTTGGTAnATC66AT6GATGnATGTGGAUTTCCCGCCnCGAGAATATAATAAAATACCTCTGACATAAAAAACATGTATAATAAAAAGTCACTATAAACGTATTCK 68640 

CAATACTTTATTCGCGAATAATACACACTACCTnGGGTTTTTTTCCCGTCCCCAAATGGTGTTTGGTGCACTCTACCAAAAAATAGAGCGCCTAAATATK 68760 

-RQTKKGOGFPTQHVRGFFLAGLYAIVRRCA 512 

AAAATACGGTTCAAAGGCATTACCCGATATTGTATTGTAGTACAGGGCAATGGCAATTGATGATCCCAATAAACGGCATAGACGCACAGCGCCGTTATAGCA 68880 

FYPEFANG5ITNYYLAIPISSGLLRCLRVAGNYCP0GSYL 472 

GGTATCTAAGTACCGGGATATCTCATACTCATGCCTTTCCGTGACAGAAACATCAACCGGAACAGTATCCGATAAACCAACTCCTGTTTTTGCAAGGCGTAAAAnCGCA 69000 

TOLYRSIEYEHRETVSVOYPVTDSLGYGTCALRLIRYGEK 432 






necurcmauTcucc™^ ~ 

* 240 

.Us^eo!^ 7MW 



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QLL6VLLEKAPPLSLLSPINCFQFE6HLRRVARA. ALLSDL 234 

TCAACTTCTTGGT6TTCTATTG6AGAAAGCCCCACCGCTATCGCTGCW 72240 

KRRVCADMFFMTRHAREPRLISAYISDNVSCTQPSVMVSR 274 

CAAACGTAGAGTCTGT6CCGATATGTTTTTTATCACCCGACACCCCACGGAACCTAGGCTGATCTCTGC6TATCT6TCG6ATATW 72360 

ITHTNTRGRQVDGVLVTTATLKRQLIQGILQIODTAAOVP 314 

AATAACTCATACAAACACTCGCGGACGGCAGGTTGACWTGTGTTGGTAACAACAGCAACCTTAUACGWAACTAnACAGGGAAm 72480 

VTYGEMVLQGTNLVTALVHGKAVRGMODVARHLLOITOPN 354 

AG T A ACAT ATGGCG A A ATGGTTCT AC AGGGGACA A ACTTGGT AACCGCCCTTGTGA TGGGA AAGGCCGTCCGCGGAAIGGATGATGT AGCCCGCC ATCTCCT TGAT A TA ACCGACCC T A A 72600 

fLNIPS : IPP0SNSOSTTAGLPI«ARVPADlVIVCOrLVFL 394 

C ACGTT A AAC AT ACCGTCTATACCCCC AC AA TCCA ACTCCGA TTC A ACGACAGC TGGGCTTCCGGTT AACGCCCGTGTTCCTGCGGA TTT A6TGATTGT TGGGGAT A AACTTGTA TTCTT 72720 



EALERRVYQATRVAYPLIGHIOITFIMPMGVFQANSMDRy 
AGAAGCATTAGAACGGCGGGTCTACCAAGCTACGCGCGnGCCTACCCTCTTATTGGAAATATAGATAT^^ 



434 
72840 



TRHAG0FSTVSEQDPRQFPPQ6IFFYNKD6ILTQLTLRDA 474 

TACACGACACGCCGGCGATTTTTCAACTGTATCC6AACAGGATCCACGTCAAU^ 72960 

M G T I C H S S L t 0 V E A T L V A L R Q Q H I 0 R Q C Y F G Y Y V A E 0 T E 0 514 

AATGGGTACCATCTGCCACAGTiCATTGCTTGATGTCGAGG^ 73080 

TLDVQMGRFNETMADMMPHHPHVVNEHLTILQFIAPSNPR 554 

CAC ATTGGATGTTC AA ATGGGGAGGTTT A TGGA A ACGTGGGC AG AT ATGA TGCCTC ATC ACCCTCATTGGGT AA AC6AACATTT A AC AATTCT AC AGTTT A T AGC TCCGAGC A ACCCGCG 73200 

LRFELNPAFDFFVAPG0V0LP6PQRPPEAHPTYNA*TLRII 594 

TC T A AGGTnGA ATT A A ACCCCGCCTTTG ATTTT THG TTGC ACCGGGGG ACGT AGACCTTCCCGG ACCGC AGCGTCCCCC6GAAGCC A TGCCAACCGTTAACGC AAC ATT ACGGAT T AT 73320 

N6NIPVPLCPISFRDCRGTQLGLGRHTMTPATIKAVKDTF 634 

CAACGGAAACATTCCCGTGCCTCTATGTCCCATTTCAnTC6AGACTGTCGCGGAACCCAACTC66TnGGGAA6ACATACAAT6ACCCC6GCAACCATTAAAGCCGTAA 73440 

EORAYPTIFYMLEAVIHGNERNFCALLRLITQCIRGYWEQ 674 

TGAAGACCGCGCATACCCAACTATTTTCTACATGCTAGAGGCTGTTATTCATGGAAACGAAAGAAACTTCTGTGCGTTACT6CGACTGTTAACACAGTGTATTCGCG^GTATTGGGAGCA 73560 

SHRYAFVHNFHMLMVITTYLGHGELPEYCINIYROLLOHV 714 

ATCCCACAGGGTGGCATTTGTAAATAACTTTCACATGTTAATGTACATAACTACATATCTCGGAAACGGTGAGCnCCCGUGTCTGTATTAATATATATCGWATTTACTGCAK 73680 

RALRQTITDFTIQGEGHNGETSEALNNILTDDTFIAPILM 754 

AAGAGCATT ACGCC A A ACT ATA ACCGA TTTT AC AAT AC A AGGAG AGGGCC AT A ACGGCGAGACCTCGG A AGCGCTAAATAACATCCTT ACGG ATG AC ACGT TT ATTGC ACC T A TTCT ATG 73800 

DCDALIYROEAARORLPAIRVSGRNGVQALHFVDNAGHNF 794 

GGATTGTGATGCGTTAATATACCGTGATGAAGCCGCCCGAGACCGACTCCCCGCAATTCGTGTAAGCGGGCGAAACGGATACCAAGCCCTTCACTnGTGGATATGGCCGGGCATAACTT 73920 



QRRDHVLIHGRPVRGDTGQGIPITPHUDREIGILSriVYY 
CCAACGACGCGATUTGTGTTAATCCACGGGAGACCCGTTCGGGGAGACAC^ 



834 
74040 



IVIPAFSRGSCCTMGVRYDRLYPALQAVIVPEIPADEEAP 874 
TATTGTCATTCCTGCATTTTCCCGCGGTTCCTGTTGTACAATG^ 74160 



TTPEOPRHPLHAHQLVPHSLNVYFHRAHLTVOGDALLTLQ 
AACTACCCCA6AAGATCCAAGACACCCTCTTCACGCACACCAACTCGTTCCGAACTCTCTTAACGTTTACTTCCATAATGCACACCTAACCGTTGATGGTGATGCATTGCTCACACTACA 



914 
74280 



ELMGDMAERTTAIIVSSAPDAGAATATTRNMRIYDGALYH 954 
AGAGT TAA TGGG AG AT A TGGC T6AACG AACGACGGCCATTTT AGTATC AAGCGCCCCCGATGCGGGAGCCGCUCGGCAACAACC AGA AAT ATGAGA A TAT A TGAC6G AGCGC T TTACC A 74400 



GLIHHAYQAYDETIAT6TFFYPVPVNPLFACPEHLASLRG 
T66CCT T ATT ATGATGGC A T A TCAGGCGT AC6ATGA A ACC AT TGCA ACGGGTACTTTTTTTT ATCCCGTTCCGGTCAACCCTCTGTTTGCATGTCCGGAACAT TT6GC 

UTNARRVLAKHVPPIPPFLGANHNATIRQPVAYHVTKSKS 
AATGACAAATGCTAGGCGGGTTTTGGCAAAAATGGTACCACCAATCCCTCCTTTTCTGGGAGCCAACCACCACGCAACTATACGCCAACCtGTTGCCTACCATG 



994 
74520 



1034 
74640 



DFNTLTYSLLGGYFKFTPISLTHQlflTGFHPGIAFTVVRQ 1074 
CGATTTTAATACTCTTACATATTCTCTTCTTGGAGGGTATTTTAAGTnACACCAATATCTCTTACACATCAACTACGAACGGGATTTCACCCCGGGATTGCCTTTACCGTAGTGCGCCA 74760 



0RFATE0LLYAE-RASESYFV6QIQVHHHDAIGGVNFTLTQ 
GGATCGCTTTGCCACAGAGCAACTTTTATATGCCGAGCGTGCTTCTGAATCGTACTTTGTCGGACAAATCCAAGTACACCAnATGATGCTATTGGGGGGGTAAACTTTAC 

PRAHVOLGVGYTAVCATAALRCPLTOMGNTAQNLFFSRGG 
ACCCAGAGCTCACGTGGACCTGGGAGTCGGGTATACAGCTGTATGTGCCACAGCAGCCCTGCGATGCCCTCTCACGGATAT666CAATACTGCCCAAAATCTTTTTTT 



1114 
74880 



1154 
75000 



VPMLHDNVTESLRRITASGGRLNPTEPLPIFGGLRPATSA 1194 
AGTGCCAATGTTACATGATAACGTTACCGAATCGTTGC6TCGTATAACAGCATC6GGGG6TCGCTTAAATCCCACCGAACCCCTACCCATCTTCGGCGGACTACGTCCTGCTACATCGGC 75120 






1234 
75240 



etiPcniS»CEFViMPVSTDL0YFRT*CNPR6RASGHLYM 

^g^^ 7 **° 

K tatI^ ™° 

cgc\aa^^ 7 *°° 

r < «: n i i y L R T A H A G E T 6 A D E V H L A Q Y L I R 0 A S P L R G C L P L 1394 

ATaTCAAKGATGCGGCCATGTU^ ^ 

, 1396 

TttGCGATUTnCACCACGCCCACATACCCACTCCCAATAAAAGCCCTGTAGAGCGCATTGGCATCTT 75840 

uiUPPFlEVLLPGELSPAETSALQKCEGKIITFSTLRH 38 

IACAACAT6GCTATGCCATTTGAGAU 75960 

o i « i v n i l L S S Y Y I H G A P P 0 T I S L L E A Y R M R F A A V I T R V I 78 

cga^Lc^^ 76080 

CC6GGAAAGTTGTTG6CGCATGCCATT6WGT 76200 

ttcgL™ 76 *° 

D i * i r i fl I L P H E V L R G A 0 V I C Y M G R R V E L E T H L 0 H R 0 G S 0 198 

cg^ggJto 

. i t p t I V L N L M F S I N E G C L L L L A I I P T L L V Q 6 A H 0 6 Y Y N L 238 

6CGGCTATTCKACATTGGTTTTAA 76560 

. ,nTi«rvfiETfiOLI*IPPMPRIQOGHRRFPIYETISS«I 278 

TTGATACAAACGGCCAATTGCGTTAGAGAAACCGGCCA6TTAATTAATAT 76680 

e t c c p i fi D T L G T R A I L R V C V F D G P S T V H P G 0 R T A V I Q ¥ - 316 

TCUCATCATCTAMCTGC^ATACCIT 76800 

GGTGTTAATAAAAACACAACCAGTCTAGTTACATTTCACGCGTCTTGTTnnm "JJ 

ACCATCAGATCATCTGACATTGTTCCC6TGGTACCTnA ™ 

VML00SMT6TT6KGTYTKTDLMGMG6KIYETLH0LQICCLT «« 

TCCACG^TCTUACCTAACmA^ ™ 

EYPOV6LICVTLSVLEQSAIIHGSHFRKVFYDVAPTKQIUL <w 

UAAAAAAAGGGTACGCCACGCTACATCCG^ ™» 

L F F P Y A Y S C G P P I S H Y F L L V P A S S L V N G r E L V A F Q Q A I * L 

S I A V A S D Q S S N 6 E I S V H Y N T F T H P H L A L I H 6 L C 0 T V C R 6 I W 
TCACTGGAAGACGTGCCAGTTA^ 

E S S S T G T L A R L F F H E L G F I I « 0 S t Y R G V I A Y G T G S A R T N S 
GTAAATKAG6GTCTACGTAAACGTACAACACTGACGATAATAW 

T F A P 0 Y Y Y Y L Y S S L I A C N A V T S P R Y L I F R E R A T I T F L P R 0 134 

AACtCCACACCCCCAGA*nCT6TnACGCCCACCTACA^ 77760 

F G V 6 G S N Q I R G G V I E Q V F S D A M F L D A T R R M A G 0 M T I F V P I 34 

TnAATACATUCACGAACAAGCTGTGACATCGCTATGTGCTAAAACACM 77880 

HLYYCSCATY0SHALYRPMHD0CYYTVYNLLQDSS6RLHY 54 

AAAAAACTTGTACTTGCTTTTCCGGTATnGnGATGAAACAAAAATAATTTTACAAnGGnTGATTTAAAAATCCGACTATAGTT ™°0 

L F s J S A I G T N T S S V F I I I C N T Q N L F 6 V I T Q V A D P R I F N A E 14 

TCCACAAACAGAA6ATTAAAATCnGACCTC6GATACCCTGGAAC 78120 

DVFLLNFDQGRIG < spliced mRNA 45 



77520 
174 
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43 



A R Q E L V A 424 
AGCTCGCCAAGAGCTAGTCGC 79440 



464 
79560 



TAC6TACATATnCTTnTnTnCCAGTACAACCATATCCGGTC^ ^ 

VVIPMACLIAICIDPALLILIKP6QRFKVEV0TRYHATGO u 
CGTTGTGATACCTAACGCGGGGCTTmGC^ ^ 

CEPMCQVFAAVIPONALTHLLIPKTEPFVSHVFSATHNSG *u 
ITGCGAACCGTGGTGTCAAGTTTTCKC^GTACATTCCCGATAACGCCTTAACAAATCTCnAATACCAAAAAC^ 7g ^ 

GLILSLPVYLSPGLFFOAFNVVAIRINTGNRrHRnifiuv 
G6GATT0ATTTTATCATTWCTGTTTATCnAGCCCCGCTnATTCTTTGATGCATTTAACGTT6TAGCGATACGAATA^ 78 £J 

A E L I P N C T R Y F A 0 G 0 R ¥ L L L C K Q L I A Y I R C T P R l 1 S S I r I «j 
TGCAGAACTAATCCCAAACGGAACGCGTTATITTGCTGATGGACAAC^TACrTnATTATGCAAACA w ™ 

Y A E H M V A A M G E S H T S H G 0 N I G P V S S I I 0 L 0 R 0 L T S C G I 0 0 »i 
ATACGCACAGCATATGGTGGCAGCCATCGGTGAATCACICACGTCAAATG^ACAATATTGGACCCGTT^ ^ 

SPAETRIQEHHRDVLELIICRAVNIVNSRHPYRPSSSRViS Xi 
CTCCCC TGCT6A AAC ACGC ATAC AGGAA AA IA ATCGGG ACGTCCHGAGCTAA T AA A ACGGGCCGTAAACATTGT TUC TC^ 78960 

GLLQSAKGHGAOTSHTOPINNGSFDGVLEPPGOGRFTfirr vu 
TGMTTGCTTCAAAGTGCAAAG6CCCACGGAGCGCAAACTTCCAACACAGATCC6ATCAATAACG6TTCCTTTGAT6^ 7 ^0 

NNSS * SIPPLQOVL LFTPASTEPQSLMEIFOICYAOLVSG 344 
AAACAATTCGTCCGCCAGCATCCCACCTTTACAAGACGTTCTATTGmACCCCACCTTCGACAGAACCCCAAAG 79 ^ 

DTPADFKKRRPLSIVPRHYAESPSPLIVVSVHGSSAiGGR Mi 
GGACACTCCAGCAGATTTCTGGAAACGGCGTCCCCTATCAATTGTACCGCGACATTACGCAGAATCCCCCAGTCCGTTGATTGTAGTATCTTACAACGGATC^ 7933, 

I T 6 S P I L Y H S A 0 A I I 0 A A C I N A R Y 0 N P Q S L H Y T 
TATTACCGGAAGTCCAATTTTATATCACTCTGCACAGGCTATTATTGATGCTGCGTGTATAAATGCCCGGGTTGACAATCCCCAAA6CCTACATGTGAC1 

R L P F L A N V I N N Q T P L P A F K P G A E N F L N Q Y F t 0 A C V T S L T 0 
GCGTTTACCGTnTTGGCTAACGTCCTAAATAATCAAACCCCCTTACCCGCCTTTAAACCAGGCGCCGAAATGTTTTTAAACCAGGTATTTAA 

GLITELQTNPTLQQLMEYOIAOSSQTVIOEIVARTPDl Tn 
AGGTCTTATAACGGAGTTACAAACGAACCCGACTCTACAACAACTCATGGAATATGATATTGCAGATTCTTCCCAAACG6TTATTGATGAAATTGTAGCCCK^ 

T I V S V L T E M S H D A F Y N S S L M Y A V L A Y L S S V Y T R P 0 G G G Y f <aa 
GACTATAGTTTCGGTGTTAACGGAAATGTCAATGGATGCGTm ^ 

P Y L H A S F P C i L G N R S I Y L F D Y Y N S G 6 E I L r L S IC V P V P Y A L «4 
ACCC TACC TTC ACGCTTCCTTCCC ATGC TGGTTAGGT A ATCGTTCT ATA TATTTATT TGAC T A TT ATA ATTC AGG AGGGGAA AT ACTT A AGC TTTCC AAGG TCCCCGTTCCCGTAGCC TT 79920 

E K V G I G H S T Q L R 6 r F I R S A 0 I V 0 I G I C S It Y L P G Q C Y A Y I r »i 
AGAAAAGGTTGGTATTGGTAATTCCACACAACTGAGGCGTAAATTTATACGCAGCGCGGATATTGTTGAIATTGGAATnGTTCTUGT^ 80040 

LGFNQQLQSILVLPG6FAACFCIT0TLQAALPASLIGPIL 
TCTAGGATTTAACCAGCAATTACAATCCATTTTAGTnTACCGCGGGGATTTGCGGCATGTTTnGTATTACCGATACCC ^ 

DRFCFSIPNPHK-* 
AGACAGAnCTGCTTCTCTATTCCCAACCCCCATAAATAAATTAGTGTCACTATAAUACATAACACCAGAATCTCTTCHATGTAAnHACGTCAnK «0280 

TAAAATATAAAATAACCG6GTGGGTGGCATTAAACCCACAAGTACCCGG6CGGCAATCCGCTAGACTGnTTTCT^ ^ 

AARKLTPEAYQRLCDALTLOMGLWKSILTOPRYIlMRSTi o 
TGC AGCGCGC AAATTAACCCCCGAGGC AGTTC AG AGACTC TGCGATGCATTA ACGCTGG AT A TGGGAT T ATGG AAGTCCA TCCTGACCGATCCCCGG6TGA AA A TA ATGCGA TC AAC TGC 80S* 

EITLRIAPFIPLOTDTTHIAVVVATIYJTRPROMNLPPrT <u 
TTTT ATA ACTTTA AGGATCGC TCCGTTTA TCCCCCTTCA AACGG AT ACTAC TAA T A TTGCCGTTGTTGT AGCC AC AA TTT ACATC ACGCGCCCACGTCAG ATGAACTTICC TCC6AAG AC 80640 

F H V I V N F H Y E V S Y A M T A T L R I Y P V E H I D H V F G A T F t N P I A «i 
TTTTCAT6TAATTGTAAATTTTAATTACGAGGTCTC6TACGCA^ M ™ 

VPLPTSIPOPRADPTPAOLTPTPNLSMYLOPPRLPCNPYA 17* 
6TACCCCC TTCC AAC A TC T ATTCC6G A TCCTCG AGCAGA TCCC ACCCCCGC AGA TC TTAC ACC A ACGCC A A ACTT AAGC A ACTAC TTACA ACCCCCGCGGCTTCCGAAAAATCC ATACGC 80880 

C It V I S P G V i K S D E R R R L Y V L A M E P H L I G L C P A G I H A R I L G P« 
ATGTUAGTTATTTCTCCGGGAGTGTGGTGGTCAGACGAACGAAGGCGTnATATGTACTGGCTUGGAACCTAATTTAATAGGGCTATGTCCCGCCW^ 81000 



SVLHRLLSHAOGCOECNHRVHYGALYALPHVTNHAEGCV 
CTCTGTATTAAATCGACTCCTCAGCCATGCGGACGGATGTGATCAATGTAATCATAGAGTTCACGTGGGGGCACTGTATGCGTTACC^ 



C 254 
GTGTG 81120 



T A 54 

CTDC 80520 



t T 94 
AGAC 80640 



I A 134 

TCGC 80760 



1 A 174 

ACGC 80880 



L 6 214 
TTGG 81000 



V C 254 
■TGTG 81120 






AAATCCTCGCATTACCGCAAATCTTGGC^ *™ 

ggJtattattucggxtto 81480 

A6CTATAT6TCAACATnACAGTAATATATTAAA*GG^ *™ 

aatagattcgcct^ 8 ™ 



81840 
252 



«Ti tt icTmtGT6ac»n«iiattTjm»™ 

«u«™gt^^ 8 ™ 

TTC«KGTCra™ « 

ktktgjckcct^ M3 £ 
tgcccgtcugckugccgaguc^^ 

6»m«™GcsmccuG««ccn6c^^ K5 ^ 

KU»GCGHCG»CCWC«mra^^^^ •» 
GLTR6FHILSM 

M S G H T P T Y » S H R R H R Y C L » E » H K R » 6 I. 27 

uiuiuTinciaiMe^ 82,00 

nlLc!^™^ 82920 

^mGGm»u!J»GKG^^^ 83180 
y«inniPPNL0ISPT»6PLRSHMMT06HEPNll»»00Q 38 

, t 0 . y L T I H P P T $ I Y L 0 I IHtmtlflllMMMMi 187 

Titticma»fGci<acGAC»cl!tt^ 83280 

£ R E S T « P I H 6 C V N H P I « « P 5 I t I C U E $ P E R S Q Q T S L F I L I » 

Liettuicatuttecu^^ 83400 

u r. i I 8 0 P I H 0 R E R Y 0 V F P Q f N I P P I V V R I S I L S R L I » P I F 118 

««Ucnutt«»6^ 83520 

ii«E0LCFSClQIRDRPRF»6RGTY6R»HIYPSSri»»lT J58 

CIC6CTCUT6UCi6TT»T6TTTnCT4^^ 83MQ 

u n c p . F « R E L 1 M » I I » S E G S I R » 6 E R l G I S S I Y C L I 6 F S L 198 

CiraOTGTT^^ 83760 
n ,roilFPIYDIIOIIOEYIYRLSRRLTIPOHIDRtI»H»FL 238 

K*UttMW*6tTWT6TTTCC6GCMMGAttTK^ 83880 
m limflintlUHMICillFLIHIFIHEnil 277 
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48 



YIGDYSLVT.LNTYSLCT 



QMSFRLVLSH6TNQPPEILLDYIN6T6LTKYT.GTIPQRVG 357 
GCAGATGTCATTTCGTTTGCTGTTGAGTCATGGAACAAAaAACC^ 84240 

LAI0LYALGQAILEVI116RIPG01PISYHRTPHYHVY6H 397 
ACTT6CGATTGATCnTATGCAnGGG€CAAGCACTCnAGAAGTTATCCTG£TAGGACGTCTTCtt 84360 

KLSPDLALOTLAYRCVLAPYILPSDIPGOLNYNPFIHAGE 
TAAGTTATCACCAGATTTGGCGCTTGATACGCTGGCATATCGATGTGTCn 

L N T R I $ R N S L R Rh F Q C H A Y R T 6 » T H S * L F E 6 I R I P A S I I P 
GCTGAACACCCGTATTTCCCGGAAncrnACGCCGGATATTCCAGTGTCACGCAGTGCGTTACGGCGTAAC6CACTCAAAGCnTTCGUGGCA 

MARS6L0R1DISPQPAKK 18 
A T V V T S L L C H D N S E I R S 0 H P L L « H 0 R 0 I I G S T - 509 
AGCCACTGTTGTTACATC6TTGnGTGTCACGATAATTCAGAAATACG€TC^ 84720 

IARVGGLQHPFVKTDINTIHVEHHFIOTLOrTSPHMOCR 
ATTGCCCGTGTGGGAGGTCTACAGCACCCTTTTGTAAAAACGGATATTAACACGATTAACGTTGAACACCATnTATAGACACGCTACAGAAGACATCA^^ 

MTAGIFIRLSHMYKILTTLESPH0VTYTTP6STNALFFKT 
ATGACAKWKiTATTITTATtCGTTTATCCCACATGTATAAAATTCTAACAACTCTGGAGTCTCCAAATGATGTAACCTAaCAACACCCGGTTCTACCAACGCACTGTTCTTTAAGAa 

STOPQEPRPEELASKLTQDDIKRILLTIESETRGQGDKAI 
TCCACACAGCCTCAGGAGCCGCGTCCGGAAGAGTTAGGATCCAAATTAACCCAAGACGACATTAAACGTATTCTATTAACAATAGAATCGGAGACTCGTGGTCA 

VTLLRRNLITASTLKVSVSGPV1PPQNFYHHNTT0TYG0A 178 
TGGACACTACTCAGACGUATTTAATCACCGCATCAACTCTTAAATGGAGTGTATCTGGACCCGTUTTCCACCTCAGTGGTTnACCACCATAACACTACAGACACATA 85200 

AANAFGKTNEPAARAIVEAIFIOPADIRTPOHLTPEATTK 218 
GC6GCAATGGCGTTTGGAAAAACCAACGAACCGGCGGCACGAGCGATAG™ 85320 

FFNFOHLHTKSPSLLVGTPRIGTYECGLLIOVRTGLIGAS 258 
TTTmAATTTTGACAT6CTCAATACCAAATCTCCAAGTCTCCTTGTGCGTACACC^^^ 85440 



LOYIVCDRDPLTGTLNPHPAETDISFFEIXCRAKYIFDPD 298 
TTGGACGTTCTTGTATGTGACAGGGACCCTTTAACTGGCACCCTAAATCCCCACCCTGCAGAAACCGACATTTCATTTTTTGAAATTAAATGTCGTGCTAAATACCTCTTTGATCCAGAT 85560 

0KNNPLGRTYTTLINRPTMANLRDFIYTICNPCVSFFGP5 338 
GACAAAAATAACCCGCTCGG1CGGACGTACACCACGTTAATAAATAGACCTACAATGGCAAATCTACGGGACTTTTTATATACTATAAAAAACCCATGTGTAAGCTTCTTTGGACCCTCA 85680 

ANPSTREALITOHVENKRLGFXGGRALTELOAHHLGLNRT 378 
GCAAACCCAAGTACACGCGAGGCCTTAATAACG6ATCACGTTGAATGGAAACGTTTAGGATTTAU 85800 

ISSRVIVFNOPDIQCGTITTIAVATGDTALOIPVFANPRH 418 
ATCTCATCCCGAGTGTGGGTATTTUTGATCCGGACATACAAAAGGGGAU^^ 85920 

ANFKQIAV0TYVLSGYFPALKLRP'FLVTFI6RVRRPHEVG 458 
GCTAACTTTAAACAAATTGCCGTACAAACCTAT6TATTATCCGGnACYTTCCAGCGCTAAAACTA^ 86040 

YPtRYDTQAAAIYEYNWPTIPPHCAYPVIAVLTPIEYDYP 498 
GTCCCATTGCGCGTCGATACACAAGCGGCT6CCATTTACGAATATAACTGGCCGA 86160 

RVTQILKDTGNNAITSALRSIRIONLHPAVEEESYOCAMG 

49 HGQSSSSGRGGICGLCIR 
AGAGTGACACAAATACTTAAAGACACAGGAAACAACGCGATTACATCAWATTGC^^ 

TTSLLRATEKPLL- 

YHELYTCNCETVALNSEFFEDFOFOENYTEOAOKSTQRRP 
AC A ACGAGC TTGTTACGTGC A ACGGAGAA ACCGTTGC TTTGA AC TC AG AGTTC TTTG A AG ACTTTG AC TT TGA TGAGAATGTAAC AGAGGACGCCG A T A A A 7CCACAC AACGCC6CCCAC 

RVIDYYPKRKPSGKSSHSKCAKC- 

6AGTGATCGATGTAACACCAAAACGAAAACCTTCGGGAAAGAGCTCCCATTCCAAATGCGCAAAATGTTAAACCCT6ATAAACCCTGATAAACGTTCTAATAAAAACATCAAATCATGGT 

TGGTTACTGTGAATGTTTGTTTTATTGCTTGGGGGTTTACAAGTACAACCCACGCTACTCCCACCCACTGTTTGATCC<TCGTATAACAGCTCATCCTCGCGGTCCGTUCATATGT 

50 * -EIGSNSREYLLEDERDTEYTS 

GTCATTTTCATA6ACGTAGCCGTAGCCTTGTGATGGGTAATTTGTGCGGCGAGAATTTCTATGTGCAGGTTTTACTTTTCGTATGTATCCCC6TACCCGCTCGGGTACTCTTCTTACGGC 
DNEYVYGYGOSPYNTRRSNRHAPKVKRIYGRVREPVRRVA 

ACC6T AGA ACCG ACTGCGTTTC TGTCG A TGAT ACACAT ATGC ACGC ATC AATCTGAGAAGC A ACAT6AC AACGGAA AAC ACGGCCAG6C AAGCCA AGGTTCCCCGAGTTGTGGGAAT T A A 
GYFRSRKQRHYVYARMLRLLLNVV5FVALCALT6RTTPIL 



CCGTGG 
R P 

AIGCGC 
H A 

CAGCAT 
L M 

GACCGC 
V A 

TA1TAI 
L L 

TCGCTC 
R H 
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IIIIHTEOLKGRGCFTTTFIiST 

S mnnac^nTm.KCicuTouuuc^cc.m^™ T 

iisK mnillLLILSLll!IFlTIPOHli»M>>i 

- mTCTJT—TOJ— " * 

Jf 111™—— 8 "« 

" ucPHT6£SNA»VT»SST0L»B»L»66"> » 

^.CU^mKlG.KKG^ ^ 
SfrESRPGKIQTGM 

•ir(g; , u( TccTVTUfiFlRLlVQLESLHRVSSEAI 147 

jf^oU^™^^ 88800 

^uu^gk™^ ™" 

Jf^mKGcL™ 89160 

^«U,Ugut™gL,U^ ^ 

l^GGKWGK.^^ G9520 

l^ui!™^ 89760 

H . . . t i f « u f S I C G C » I 0 » L 0 T 0 I It T L f N Q H 6 D S I » fl I I F 

WTintCtHUCKTCMGIGSUtCCGTCCWCGH 89880 
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E V M R C N V TOAKIIINRPVIRTTGFLOGCHKQCFRPIPTKH 707 

TGAGGTTATGCGCTGTAACGTTACTGAC^AGAnATATTAAACCGCCCGCT^^ 90000 

EYNIAIFRLIVEQIFGARVTKSTQTFPGSTRVKNLKKKDL 747 
CGAATATAACATTGCTCTATTTCGTTTAATTTGGGAACAATTATTTGGCGCCCGCGTAACTAAAAGTACCCAGACCTTTCCGGGAAGTACTCGTGTGAAAAACCTAAAAAAAAAAGATCT 90120 

ETLL0SINVDRSACRTYRQIYNLLMSQRHSFSQQRYK1TA 787 

AGMACTTTACTTGATTCAATTAAC6TGGATCGTTCTGCAT6TCGT 90240 

PAVARHVYFQAHQMHLAPHAEAMLQLAISELSPGSIPRIN 827 

CCCCGCTTGGaACGCCACGTGTAnTTCAAGCACATCAAATGCACTTGGCCCCGCATGCCGAAGCCATGCTACAATTAGCGCTATCGGM^ 90360 

GAYNFESL- 335 

CGGGGCGGTAAATTTTGAAAGTTTATAACCCGTTAATACCATATATGGACATCCATAGGGGGGGTTACATAAATACTAAGCCTCTGTACAACACAAAGG6C 90480 

MDATQITLVRESGHICAASIYTSWTQSGQLTQNGLS 36 

CACAACCAAGCTATGGACGCAACGCAGATTACCTTGGTTAGAGAAAGCGGACACAnTGTGCCGCAAGCATATACACATCCTGGACACAGTCCGGACAATTAACACAGAACGGTCTU 90600 

Y L Y Y L L C r H S C G K V V P I F A E I T Y 0 0 E 0 L C R Y S R H 6 G S V S A 76 

6TGTTAT ACT ACTT ATT ATGC A A AAAC TC ATGTGGGA A ATADiTCCC T A AGTTTGCCGAAATT ACUiT AC AAC AAGAGG ATTTATGTCGC T ACTCC AGGCATGGGGGGAGTGTTTCTGCG 90720 

ATFASICRAASSAALOAWPLEPLGNADTVRCLHGTALATL 116 

GCAACGTTTG€GTCTATCTGCAGG6CGGC6TCCTCGGCTGCGTTAGACGCCTG&^ 90840 

R R Y L G F t S F Y S P V T F E T 0 T N T G L L L I T I P 0 E H A L N N 0 N T P 156 

CGGCGCGTATTAGGGTTTAAATCGTTTTATTCGCCA6TAACATTCGAGACTGATACGAATACAGGTCTTCTGTTAAAAACAATCCCCGATGAACACGCGTTGAATAATGACAACACGCCA 90960 

S T G V L R A N F P V A I D V S A V S A C N A H T Q C T S L A Y A R L T A L I S 196 

TCTACC6GAGTATTGAGGGCTAAnTTCCCGTGGCCATTGAT6TTTCAGCAGTCAGCGCATGTAACGCCCACACGCAAGGTACGTCGCTAGCCTAC6CCCGCCTGACCGCACT^ 91080 

NGDTQQQTPLOVEVITPKAYIRRKYKSTFSPPIEREGQTS 236 

AACGGTGACACCCAGCAACAAACACCTTTAGACGTGGAGGTAATTACACCAAAGGCCTACATACGTCGGAAATATAAGTCTACGTTTTCCCCCCCTATAGAGCGGGAAGGCCAAACCTCC 91200 

OLFNLEERRLVLSGNRAIVVRYLLPCYFDCLTTOSTVTSS 276 

GATTTGTTTAACCTTCAAGAACGCCGCTTGGTTCTTAGTGGCAATCGCGCAATTGTGGTAAGGGTACTCTTACCGTGTTATTTTGACTGrTTAACAACGGATTCCACC6TTACATCTTCC 91320 

LSILATYRLWYAAAFGICPGVVRPIFAYLGPELNPKGEORO 318 

CTTTCAATATTAGCAACATATAGACTGTGGTACKGGCGGCGTTTGGAAAACCCG 91440 

YFCTVGFPGVTTLRTQTPAVESIRTATEKYNETDGLWPYT 356 

T AC TTTTGTAC TGTCGG ATT TCCCGGATGGACCAC TCTTC6GAC AC A A ACTCC AGCCGTCGAATCT ATTCGC ACGGCTACGGAG ATGT ACATGGA A ACGGATGGGTTGTGGCC AGT AACC 91560 

G1QAFKYLAPI6QHPPLPPRVQ0LIGQIPQDTGHADATYN 396 

WTATTCAC^TTTCATTATCTAGCCCCCTGGGGACAGCATCCCCCCTTACCTCCGCGGGTGCAGGATCnATTGGGCAAAK^ 91680 

NOAGRISTVFKQPVQLQDRVKAKFDFSAFFPTIYCAHFPM 436 

TGGGACGCGGWCGGATATCTACCGTCnCAAACAGCCTGTACAACTACAAGATCGTTMATGGCAUGTnGAmCAGCGCCm 91800 

HFRLGKIVLARMRRGMGCLKPALVSFFGGLRHILPSIYKA 476 

CATTTTAGATTAGGCAAAATCGTCCTGGCTAGAATGCGTCGAGGAATGGCGTGCCTAAAACCCGCGTTGGTGTCnTTTTTGGG^ 91920 

IIFIANEISLCVEQTALEQGFAICTYIKDGFVGIFTOLHT 516 

ATTATTTTTATAGCCAATGAAATTAGCCTTTGC6TCGAACUACGGCCTTGGAACAGGGCTTTGCTATATGTACTTATATAAAAGATGGATTTTGG6GAATCTTCACCCATTTA^ 92040 

R N V C S 0 0 A R C S A L N L A A T C E R A V T G L L R I Q L G L H F T P A II E 556 

CGCAATGTATGTTCAGATCAGGCACGTTGTTCtMJCCnAAATn^ 92160 

P V L R Y E G V Y T H A F T W C T T G S I L I N L 0 T N T P P 0 I V G Y P N R S 596 

CCGGTACTCCGGGTCGAGGGTGTGTACACTCACGCATTTACCTGGTGTACCACGGGAAGCTGGCTGTGGAATTTACAAACAAK 92280 

0 A A R 0 L I E R L S G L L C T A T I I R E R I Q E N C I M D H V L Y 0 I W A G 636 

CAGGCGGCGCGAGATTTAAAGGAGCGTCTTTCAGGACTCCTATGTACCGCA^ 92400 

QVVEAARKTYVDFFEHVFDRRYTPVYVSLQEQNSETKAIP 676 

CAAGTTGT6GAGGCTGCCAGAAAAACATACCTCGATTTTTTT6AACATGTTTTTGATCGCCGTTATACTCCGGTATACTGGAGTCTTCAGGAGCAAU 92520 

ASYLTYGHMQOKDYKPRQIINVRNPNPHGPPTVVYHELLP 716 

GCATCTTATCTGACATACGGACACATGCAAGATAAGGATTATAAACCAAGACAGATAATTATGGTTCGTAATCCCAACCCACATGGACCTCCTACTGTTGTTTACTGGCAATTGCTACCA 92640 

SCACIPPIOCAAHLKPLIHTFYTIINHLLOAHNOFSSPSL 756 

TCGTGTGCCTGTATTCCCCCCATAGACTGCGCTGCTCATCTCAAGCCCCTTATACACACGTTT6TCACTATTATTAACCATCTTCTAGAT6CTCATAATGATTTTTCAA6TCCATCATTG 92760 

IFTOOPLASYNFLFL- • 771 

AAATTTACTGACGATCCCCTTGCTTCATATAACTTCTTGTTTTTATGACAAAAAAACACGCCGCAACAACCCATCCTTAAAATAAAAGGTTTATTTACTTTACAACCC6TGCTGAATTTT 92880 

-KVVRPSNK 324 
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TATlttTllUUTWTO^ 93000 

Y T E F L Q Y N I P T V M T R H L V N F Y V S t Q V E S 0 R L P 6 N I L 6 P P V 284 

CirciCCUACCnCTGGACIUtt^ 93 ™ 

Q V Y G 0 P C P P T 6 H P I 0 F R 0 6 V 6 C S Y I A I I I N P A T 6 P Q P N I L 244 

tfnAACAAACATATAATGCGfKAATACGC^ ^ 

IYFMYHPLVRTKTFPNN0YYYMLTNLLILV6EYLFRRMKE 204 

ttAAAGGTCCTATnCICACTCAAtttGTCTAAGATATAU^ 93360 

I P 6 I Q C E Y R R I Y V S L Q V F L S P S A 6 S P 6 I G E L Y M L I V C L T F 164 

AATCTAGGACATTATCCGGttWAATAGAGTCATCCGATAGAnAA 93480 

0 I V M 0 P R F L T M R Y I L I R P P V G V T Y Y R 0 E V A T L V S F I F G R F 124 

AC6CTG6ffGAGTAACACACTCCATGTA6TAAC6ATCACA6GACACCTCACTTGAATCACCATTC^^ 93600 

APQTVCEMYYROCSYESSOGHLYVLVTEQHEPCYRLSVYS 84 

AGTnGCCAUAAKGTGGCnCAAACCGGnACCTCCCGC^ 93720 

NALFRPKLGTVERAECVFRPI AQIRLOE IYDSVDLGEEAR 44 

" * L -SRRDRRECGVGRRS 756 

GT6nAGTAAGnGTC6ATCGnACGCTGCAACCTAAAATGCT6GGTATATnAnCCGGACATCCCATCGGCCATCCCCtt 93840 

T I L N D I T V S C G L I S P I H I G S H G 0 A H 6 A 6 G T Q E F t I I Y P R I 4 

TNTLQR0NRQLRFHQTYKMRVDWR60GRRRHARIK0LLTS 716 

TCCGCTGCATTTACCTTGTGTACCCGTA ACCTCTC AGGGGGGTGTCC TTTC AT* A A ATGGGATAGGTTTTTAT ATCC A AC ATGC ATGTATT6GTT ATT7ATTTT ATT6GGTTCCG664TT 93960 

oVa\ V I H V R L R E P P H G K H F H S I K K Y G V H H Y 0 H N I I N P E P I 676 

CTnCGTCATCnCTGTAGGGTCAGGCAAACCCCAGGAAGGACTTGGTGTTCTtt^ 94080 

REOOETPOPLGiSPSPTRRPGRKIVEARVQHEYLIRIQSL 636 

TAGGACTCTGnCTCGCCTTnTAAAUTAGCCTGGCA^ 94200 

YSETRAKKFIAQCLEEESRHVESQTVIFGIRTARLIFSFY 596 

GttGCUCTAGTAATGAGATTGACGCAmGAATATGATACAGAAAmC^ 94320 

P A Y I t S I S A M S Y S V S I E Q G Q N N N V R H L I F C R Y L E Q I * L E S 556 

UACGTTnATATCATCTCCATAAGGGG6GATATAACGAGATTGAAAA 94440 

LRIIDDGYPPIYRSQFSNAIYAD06II6TLDIVEHMLDAI 516 

CGTGTnCTTCTKCAn6TAATAT6TGACCCTffAGATGGCTTTATT 94560 

RTEEAMTIHSGItSPKIItYREERLRKLE6EKFQLRETLDSM 476 

ACATCCTTGAGACCCTCAATGGTnTGAATAAATTATTCACATA^ 94680 

Y 0 I L G E I T I F L N N V Y G E L M G H I S H V V S T t F A S Q Y N T T G S M 436 

GaCCCCCGnAAAGGAnGGnGGCCTTttCAAAC^ 94800 

AGGHFSQNACGFGPQST0VSGS6ELIHHGTEELYSRVTET 396 

ATATCTCCAACAT6TCTCATGTTTTnAA6TTAACTAnA6CT^ 94920 

IDGYHRHNKLNVILrVLRSAASGARTTONEGMIROVARTS 356 

TGA6ATCTATCATCTTCAnCCGWGACCTATTAACACGCttAA 95040 

H S R D D E H R R G I L V R L P A T K L V Q C V R A H E R L A Y C A L V E 6 S L 316 

CGCTGTAA6GGTGAGTCAAATACTACACCCTCCCCACATACAACGGGCGGCCACAK^ 95160 

R Q L P S 0 F V Y 6 E 6 C V V P P V V V L C E G I H 6 T V 0 0 I A M I L R R N V 276 

TT ATAAT A AAG ACGCGTCCT ATC AT A ATCC ATA AT A6C A AC AnTTGC AT AC KTC AAC T AGGCTTGT6ACA ACCGCCKTCCTCTGGCC A AC6TTGC ATCGGC A ACTTTT A AC ATCTGG 95280 

NYYlRTRDYDMIAVNQIICEVLSTVVAAGRALTADAVKLliQ 236 

GACAGTTCTGCCGCTTGACCCATATACGTATTTAATGGTGCAGGGGTTCCATU^ 95400 

SLEAAQ6MYTNLPAPTGHQESRVKRVVPYIGVCAIIDYYK 196 

GCUAACCGACCCTTCCATnAAACCACTGGTATAGAGACAACCGGTTAnCCACGCAGAAACTCAAGTAACGATGACTGTAATGTnGACGCCAGGTnCAAA 95520 

A F G Y R G N L G S T Y L C G T I G R L F E L L S S Q L T Q R * T E F V Q H A I 156 

C6T ACGGCTTCTG AnCtCC AC AT AGCCC ATAACGTTCCGC T AG AGCCCC6GC ATGC AGGTT AC ATT6nGG ATGTGGTGHCCC A ATCTGCTGCT 95640 

R V A E S E 6 C L 6 Y R E A L A G A H L N C Q Q 1 H H E I 0 A A L D E Y R T A 0 116 

AACGCGnCATCAAAACGOnttCTGAACnGGCGAATTACAGTnCCGTAGACCGTACAGCGCTATATATGCCTTCTCCATCGGTATATCCA 95760 

LANMLYTAQYQRIVTETSRVASYI6060TY6FD6ALIKRF 76 
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AACATACnTGCGTGGnGGGTGTATTAACATCCAGCUTCTTCCTCC^ 95880 

LNSQTTPHILMR6DEEPFTCFGIDPAYENtfYIDFNNKYQD 3$ 

» « t 2 

TTTGGGTTACTTCCATTCAAGCCCTGGTCAATAGAAACAGAACTTGCTATCCTTTW 96000 

NPNSGNLGQDISVSSAIRKEESGSSNNFLSTIEAH 1 

RSISVOSSSPrM¥FNPETPN6FD0SVVLMFTS«HSI0PIL 42 

AAGATCAATTTCTGTAGACAGTTCTTCACCCAAAAACGmnAAT^^ 96120 

S R I R E L A A I . T I P I E R V P R L C M F t Q L L E L Q A P P E H Q R N E I P 82 

CTCACGGATTCGAGUCnGCCGCAAnACGATTCCAAAAGAACGTGm 96240 

F S ¥ Y L I S G N A G S G t S T C I Q T L N E A I D C I I T 6 S T R V A A Q N V 122 

CnCTCCGTTTATnUTTAGCGGAAATGCC6GCTCC6GAAAAAGCACGTGTATCU^ 96360 

HAICLSTAYASRPINT1FHEF6FRGNH1QAQLGRYAYNWTT 162 

TCATGCTAAGnATCAACGGCTTATGCGAGTCGTCCGATAAACACAATCTTTCATGAATTTGGTTTTCGCGCUATCACATTCAGa 96480 

TPPSIEOLQKROIVYYWEVLIDITKRVFQHG. ODGRGGTST 202 

GACCCCCCCTTCTATTGAGGACCTGCAAAAAAGAGATATTGTATACTACTG^ 96500 

FKTLVAIERLLNKPTGSHSGTAFIACGSLPAFTRSNYIVI 242 

ATTTAAAACCCTGTGGGCAATTGA ACGTTTGCTTAATAAACC T AC AGGC TC AATGTCCGGA ACCGCGTTT ATCGCATGCGGnCCCTTCCGGCTTTTACCCGGAGCAACGTTA TTGTT AT 96720 

OEAGLLGRHILTAVVYCWKLLNAIYQSPQYIHGRrPVIYC 282 

TGATGUGCAGGAnGCTAGGGCGTCATATTCTCACGGCC6TT6W 96840 

VGSPTQTOSLESHFQHDttQRSHVTPSENILTYIICNQTLR 322 

CGTCGGnCGCtf ACCCAAACTGAtTCGTTAGAATCTCAW . 95950 

Q Y T N I S H N N A I F I N N I R C 0 E 0 D F G N L L I T L E V G L P I T E A H 362 

TCAATATACTAACATCTCACATAACTGGGCAATCTTTATU^ 97080 

ARLVDTFVVPASYINNPAHLPGNTRLYSSHKEVSAVMSKl 402 

TGCGCGTCTGGTCGATACATTTGTTGTACCTGCATCCTATATTAACAATCCTfHITAATCTTCCCGGATGGACGCGTCTG 97200 

HAHLKLSKN0HF5VFALPTYTFIRLTAF0EYRKLT6QP6L 442 

ACACGCKATTTAAAACTATCGUAAATGACCATTTnCTGW 97320 

5VEHMIRANSGRLHNYSQSR0H0HGTVKYETHSNRDLIYA 482 

TTCTGnGAACATTGGATACGGCCAAACTCCGGTCGTTTaACAATTATTCCCAAAGCCGAGATCATGACATGGGAACAGTTAAATACGAAACACATK 97440 

RTOITYVLNSLVVVTTRLRKLVIGFSGTFQSFAKVLROOS 522 

CCGTACAGACATCACTTACGTGCTAAATAGTCTCGTAGTTGTAACCACAAGACTACGTAAGTTAGTTATTGGATTCAGTGGTACATTTCAATCGTTTGCAAAGGTTTTACGTGACGACTC 97560 

FVKARGETSIEYAYRFLSNLIFGGLINFYNFLLNKNLHPD 562 

CTTTGTGAAGGCTCGAGGAGAGAC ATCC ATCGAATATGCTTACCGGT TTCTGTC A A ACC TA ATCTTTGGAGGCT TGA TTAACTT TTAC AATTTTTTGTT AAA TA AA A ACCT AC ATCCCGA 97680 

K V S L A Y K R L A A L T L E L I S G T N I A P L H E A A V H 6 A G A 6 I 0 C 0 602 

TAAGGTATCGnAGCATACUACC^TTAGCTGCCTTAACCCTGGAGTTATW 97 8oo 

GAATSAOKAFCFTKAPESKVTASIPEDPDOVIFTALNOEV 642 

TGGTGUGCTACTTCTGCCGATAAAKCTTC^ g7920 

IDLYYCQYEFSYPK$$NEVKAQFLLMKAIY06RYAILAEL 682 

TATTGACnGGTATACTGCCAGTACGAATTncCTATCCCAAATCATCCAATGAGGTCCATGCTCAGTTTCTGTTAATGAAAGCTATnACGATGGTCGAT^ 98040 

F E S S F T T A P F $ A Y Y 0 N ¥ N F N G S E L L I G N V R 6 G L L S L A I Q T 722 

TTTCGAAAGCAGCTTTACAACCGCCCCCTTTAC<GCGTATGTCGATAAT6TTAATTTCAACGCAAGCGAGCTTTTGATC(^AATGTGCGG 98160 

0 T Y T L L G Y T F A P V P Y F V E E L T R K r L Y fl E T T E M L Y A L H V P L 762 

AGATACGTATACCCTTTTGGGGTATACTTTTKACCCGIGCCAGTCTTT6TAGAGGAACTGACCCGAAAAAAGCTGTACCGCGAAACTACCGAAATGTTATAT6CTCTACACGTACCTCT 98280 

MVLOOQHGFVSIVNANVCEFTESIEDAELANATTVOYGLS 802 

TATGGTCTTACAOiATCAACATGGGTTTGTGTCCAT^ 3^ 

SKLAMTIARSQGLSLEKVAICFTADKLRLNSVYYAHSRTV 842 

TTHAAACUGCCATGACAATTGCACGCTCACAGGGTCTGAGTTTAGAGAAGGTAK^ 98520 

56 MKNPQKLA1TFLPLYVIPTYTLCI ZA 

S S R F L t U N L N P L R E R Y E t S A E I S D H I L A A L R 0 P N V H Y V Y - 881 

CTCCTCTAGGTTCTTAAAAATGAATCTAAACCCTCTACGGGAACGATATGAAAAATCCGCAGAAATTAGC6ATCACATTCTTGCCGCTCTACGTGATCCCAACGTACACGTTGTGTATTA 98640 



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LmG^G^^ 

• 244 

!cmm««!*^ 99360 

«»unie«neiwii^^ «» 
F 1 M -"t E r R t Vo"h 6 s m e i n i p o i i i i i i i i i. i i i o i n f £ n f > m 

VFMQLSKHSSPNFCMKSLTVHlflni»on» a 

Tjmucioocsnc^ 
cLctKUTauTCCccucccc.neuauTscceiteGT 

ffiiifiiiiiiii'iifftfi'i 1 

TGCGKCCC^c™^ 

KLEOSLLDYIEPSIHFKSFVtVSDLabSirir 
flimm.inilU«n^ 

MniC»TnMnGGCMW(iCGTT66l*CHKGGTCGGCTCCCCA6KAUTCU 101280 

M E . VcVbVf'l V/.'. V/cV* £ p , n t s s p t s 0 , . v 0 s t » d l « 

Cm««CGMTJTG^^ 

ttccggtict*»«t^ 
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T6AtttaTG«KTTGT6nnCttT^ 

1 

TU»CTCTTKTGTC»T»TCTUTTnCCTUU«^^ nm) 

T"C«6TMeGWTUGCmHKTCCCCC6mT^ ^ 

* A 
™G»CCGTmCMGCG6ttTGHUCaTCU»TTra^^ mv0 

< ft; 
"GTACCCCAAACTTAAAACGCTCAAATT^^ m ^ 

TAnTCTUGCATTCTTACTGCDTACnCGCACCGTGCnAAIATATCAACCUTITCCITTATGCTACICCrn 102360 

TGCATCATACCTAATTCACCTAAAAAKTGACTCATT6CAGCAGCCnTCCTCCTT6CAGACTATCCAGn$$CAnTTAUC666T^ WW 

ICTGTAAGTACAAAACTAAAATTTATATTTGCGTaGTAmTGTAACATATATGCCTTTTATCCCCCCGCAAGmGCTTTACC 102W0 

TT AATA ACTTTAATTGCT AT A AG ACA TACCC AAACCGGATGATTTTTGCCGC TGGAA A AK A6CTTCT A ATTTTCCCGTCTC AAC TCGGCCnGGTTGCATCTCC* AG 102720 

naraCeTAMBGTGTATAIATACUttfiCTBtt ^ 

ATTGTATnGCAGAGCAGGATGCCCCG6TTACTCCGAGACC6GATTGCGG6CATTCCGMTCGTGTAC6GAC 102960 

TACGACCAATAGCAACACTCAGGTATTTTTAAAATGCACGTTTAATGATCATAATTTACATACAGnGGTAATAAAGCAGACTGTGGATGTTT^ 103 O80 

ACTAMACTTCTTCATCTTGTTTGGAATACCTTTACCCGCTTTAC 
-SKKMRNPIGI6AICGASSKKTLHKLSGSTSGSTSPTNSE6 429 

CCTGTGGAGAI^AACTTTGCGGGTTTTACTTCCCTTACATGCCGAATCAGACTCAGAre ^ 
0 P S A V K R T I S G t C A S 0 S E S T L D I T I C R K V 0 R 6 T H F I R R I A 389 

CACCCCTTGCKTTCC^ 

G R A S G P r I G P L V S F L R V It Q P T H V L R T 0 R E 0 G R S T E C T S T T 349 

TTTCTGGTCTTACAATTGGACCTGTAATTAGTT6GATGGCTGTATCTTTCCAGGTCC ^ 
E P R V I P G T I I Q I A T D K W T V T Q M T L R T P 0 T C R 0 L I F L M N T ¥ 309 

CAAA^TCCTGTTGAATCATaAAAAGACAACGCAGGGAT 

F P C T S 0 H L L C R L S T K L G A E 0 R G Y I G I V H L I L « C T P E V I E P 269 
GTTGATACAGTTCCGCAAGTTGATCATCA^ 

0 Y L E A L Q 0 0 L N 6 L L P Q N Y C P F R A N 6 E T G R 0 T A R F P L R 0 M T 229 

"l*™!™™^^^ 103920 
R P D 0 N I G V S V P R 0 R Y 0 Q G G T R P Q Q A S E I S G Y Y V P T A S R ¥ A 189 

^ *^*'"'^^*Jf^"''J" , ''****'**'''*CCTTCCGT AAAAATGATGCGGT AGAGC A TGTTTTGCTT AC ACC AGGGC TCG AGTC TCGGGTCGGTGGTTCTAT AGA ATCC TGTTG AG AGTC ACTTCGTG 104040 
PQOKFFVKRLFSATSCTKTVGPSSDRTPPQISOOQSOSPS 149 

AC TC TGCTGTGGGCTCTC T AGCCGACGATT6A AGGGGCCC AGGGTTTGGTGATTG A ATGGGC TCCCGAC TCGA TCTTGATGTTGGCTGTTGGATGGAC TCCCGACTCGGTCCTGGGCTTG 104160 
EATPERASSQLPGPNPSQIPERSSRSTPQQISERSPGPSP 109 

C™UGUGA™ ,0420, 
P t L 0 I V 0 G P L I 0 I S 0 E F S P E S F G 0 0 0 S P H V E Y E t Y S T 0 S Y 69 

^ A TGC AGG A TGC A T TGC AC TGGACACCGGCAGAGAGGAC ACTGGACGC TGGTGG AGGTCC A TGCCCG AATAC A A AC A A AGC AG AAGTCGTGC A A AC ACGGCA TGGTT TTTCCGA 104400 

1 I H L I S Q V P C R C L P C Q V S T S T H A R I C Y F C F D H L C P M T t G I 29 

GATCWAAACGGTGCTCATGCATATGGT6CA6GTATTATCCGAAGCGTCG6AGGTGCCGCTACCGCCCa 1M520 
DSVTS **CITCTN0SA0STGSG6ALIT0M , 

CAAACACGTAGCAGAACTGCCATGCGTTCTAAATTGTGAGTTGTGGCGAGTACATTTTTATAATTGGTACCAACGAAGACACACCCCTATATCCCTCCACCCATTTCTTTTAAGTCCCAC 104640 
CCACTAAAACGTGGGTATAAAATGTGTATTC<^ 1M760 

TTATTTAGATAAAGAGCGATACGAAGACATTTCTCCACCCCCCTGTAATACCCGTAAATAAAGGTAAGTCCACAAACAAAAGCACTGTATATAGGAAGTCGGGTGTATTGGGACAGn 104880 
UL - — x— - IRL 

TCCATTAGAGGCGTACAAACAATACTGGGATAGGGTAATGCAAGTCCCCCCCGATGGTCGCCCCGCAAACGCGCGGGGAGGTGGGGTCWTTUTTTTTTCTCTCT 105000 
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IRL — -x-"~ IRS 

MMaramttiCTCCC^^ 105120 

nCHIWiCCGnKGCnnmAttUC^ 

6 « V D 0 E 0 E S S S S S S I L E V * V » P B I 6 V R B D B L « 6 P » » S » 1 S 1259 
JCCCC— ^ 

GTGCGCCGCTGTC^^ «» 

T R R Q R G A Q T E S P R E S 6 G H H G F V F R T G D S G E L V T E A A 6 F G P 1179 

GGCCACAHACTCTGGGAATC^ 

p N M V R P I P P t A N R A E 0 A S A Y L A V S R G A V P P L V L P E A A P D P 1133 

TCCAGCAGGGCGTGGCGGCAAAACACCCTCGCCCAGGTGGGTACGTCGCCGGCCTCCGGCCCGGCGGCCCCC6GTCTCC6TCCCK 10*840 

E L L A H R C F V R A t T P V 0 G A E P C A A G P R R G E P L F ¥ P R L A A 6 L 1099 

ttCCATCGGTffGCTKKGGTGGCTATGTGCC6CCTCGTC ™™ 

G I R N A A R H S H A A E D V F 0 A A G L G L G R S Q R A L D K C G 0 F G P L V 1059 

TACTttCGGTATTCACGGGGCGACAGGG^ 1 ^ 

Y Q R Y E R P S L P V R T K P G A R T C V T Y A V N G G R C L C L P Q E G P V t 1019 

TTGGCAAACGCAGTCTCCm^^ 

NAFATEIRAFSAPGFTRSSAFVARALGETVASDPHRVVAO 979 

GC6TCCGGTCGC6CC6GGGCCCGGACGTACACGTGATACTGAGAC^^ ™320 

A D P R A P A R V Y V H Y Q S L A P 0 0 R P 1 R E L A V A 0 L V L L R R R A S A 939 

AKCGCGAKCTAGATACTCGACGGCCCCGGCAAAGGCCAGGTCTC^ 

L R S 6 L Y E V A G A F A L D R T S L L L V C R A H L A S V 0 P P G T * H 6 A « 899 

6CAT6AGTKTCGGCAGGCACAACCGGTCACTCAW ™™ 

A H T S P L C L R N S L A A L V V S L G G R S P S I S P G P 0 T A L T S E L P G 859 

TTGATGTCCTCTCCGGGCAACGGATCGTAGATGATCAGAAGCCTCACATCCTCCG6GTCT6GGATCT6CCGCATCCAGGC6CACCTCC6TCGCAK '"^gig 

H I 0 E G P L P D Y I I L L R V D E P D P I 0 R H « * C R R R t A E V G S P P G 819 

UCCGTCGGTCTCCTCCC^^ 106800 

F R R D G G G P R R A A I E A L A G P D F S L A P R i A T P F L P 0 0 V L R A I 779 



106920 
739 



GTTTCGGGGGTAC A6T AC6CCTTGCG AGCCTGGTCCGAC6GGACCGGG6T ATGC AGGGCCCCCCGGGG A AT ACGCC6AAATCCCCCGTTT6GGGCCGGTCCGTC A AGTGGC ATCGTT ATT 
TEPTCYAKRAQ0SPVPTHLA6RPIRRF6GNPAPG0LPMTI 

ACGGCGGGGGGATCCACCACAGGGCCCGAGGTGATGGTCACGGGCTCGGATACCCGCCTCn ™£ 



107160 
659 



107280 
619 



I U*U»t»uwvv.i.v)»w.«-.-v . . „ .. n ». UD iniUDTCQ 

V A P P 0 V V P G S T I T V P E S V R R K A I S V V H D D A V R A D A Y P T E R 
ATCTTGTC6AG6AGGCTTCTGCTCTCGACTGGCT6GGACTTGCGCTTGCGCGGAGTTCGTAAACGATCATCCGGTGGACACACAGAAAGAGA 

I I 0 L L S R S E V P Q S I R It R P T R L R 0 0 P P C V S L S R A A A S P 0 P 0 
G6AKCT6TGTGGCC6GGGTTGTTGGAGAAGGGT6ACCK 

P A Q T A P T T P S P H G R S I R A A P S S G T A E P Y A M S A F A R R L S Q L 

AT AAAGTGTTTTTGGGCCCGGTCGT ATCGACGGCTC AT AGCC ACG6CCGCGGCCGCGTGGGGG AGAGCCC AG AGGGCCTCCCCCGTGGCC AT6GCTTCGCCT AC A TGCM A ACGGG AGAC 107400 
I F H t Q A R D Y R R S M A V A A A A H P L A 1 L A E G T A M A E G V H P V P S 579 

GCTAC6CTCCCCGTAACGGCGGTACCC&CCC6TCCC6GTGGCAACAGCTTTTGGTAGAACTGGTTCAGGGCCGAGTTGACACCGGTCAGCn 107520 
AVSGTVAT6ARGPPLLKQYFQNLASHVGTLIPNQLWAIP0 539 

CTGTCTGGACAGTAGATCAGGTTAATCAKGCKGGTACTGTCTAGC^ W ™ 
R o P C V I L N I L A R Y Q R A P 0 G L E P V Y L P V P E T S A E Y R A R A A R 499 

AC AGCCTC ATCCTCCC AGTG ACCC TCTC T66TCTCCCCGGAC6GTCC A A ACCGC ACCCTGTTGG ATGG6AGGGGTGCCG ATCCGGGCC A AGGGCTTCCGTCGGGC ATC ATG AGCGGCCCC 107760 

V A E 0 E N H G E R T E C S P G F R V R N S P L P A S 6 P « P S G 0 P M M I P G 459 

GACACC6GGGGAATTATCGGGGnCTGGATCGCGGCAGGGA 107880 
S Y P P I I P T R S R P L S F S K Q R Q R G P E G A L R C T I R I R P 0 P G S I 419 

GATCGATATCCCGGTT6GATATTTTGTnCGTCGACCCACCATCAnTGAG HWOO 
SRYGPQIH0KTSGGDHS0SDDSNSPSPAHER6SRSGTTES 379 
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CCC ACC6AA ACGCGCCGGGOTTCATCG TCT TC A TCC TCCGATGACG ATCCCC AC6ACGAGGAAGAGGATGAAGAC6AA AC A A ACTCACGACTCTn6GCTT WCTCC AC TGGGCTGTC A 103120 
GVSVRRPEOOEOESSSGISSSSSSSSVFERSrPKKEVPSO 339 

TCCTCAATCGGGTCTGGTGCGTGGGATCTTCCCGCCAGGGCCAAAAAC^ 108240 
DEIPDPAHSRGPLALFABPKGGSSRGPVRPTIGRADHIRP 299 

CGGGTATCATCWTCTCAAACGC<AMTaGCCTTTGCCCCCTTAGC4KHiAACaTGTCCGAAAGGACG 108380 
RTD0TEFPLDAKAGCAPVS0SLVHYLQEVP6PVP66PKRA 259 

GGGIGTGGG ACCTTA ACCTTC AAAGTC TT TT TCTTCGGGCTCTTTCCCTGAGCGGGCCGTTG AGTTTTCTG6ACAAC TACTCCGTCCCCC6ATGC ATGCGCA TGACCCGCTTGC TCA TC6 1084S0 - 
PLPVKVKLTKKKPSKGQAPRQTKQLVVGOGSAHANGAQEO 219 

CCCGGCTTTTTACCCGAGATGGACTGAGTTTGTCTGTCTC^ 108600 
6PrrGSISOTQR0RHV¥SPLGPSHGRTT0RPISRQOERQO 179 

TTCCCGGC6GC6TCTCCAACAGGAGAC6CGG^ATACAGGGGAGAAG6CCT^ 108720 
NGAADGVPSAPSVPSFAQPFPPTTGRGTEGDNHPPKPSRA 139 

AGCTTCGTTCCGAGAGAGACTGTCTCAAGGGAGCGATCGGCTCCTGTTWTTCTCGCGCGCCGC<CTCCGAGAATC6W^ 108840 
LKTGLSVTELSRDAGTPERAGAESFRTHFVEALPIVPSGM 99 

AGATCCTGACCGTCCTCGCATACGTAGTCGTCTfGTGTTAGCTCTICGCCAACATCTTCCGTTCTGC^TTCTGGTTGAAGTCCCGATACGGAGGGAATT6AAACGATCTCGTGTTCCCGT 108960 
L0QG0ECVYDDQTIEEGVDETRPEPQLGSVSPISV1EHER 59 

CCCACCATGACCCCGTTCTCTCCAAATAGTAGATCGTCAGGCTGACTCGAGGTGACCACCCGGGCCCTGTGTTCGGCGGCCGCCGtGGCCGCGTCCAACA 109080 
GVMVGMECFLLOOPQSSTVVRARHEAAAAAAOLLOMLELT 19 

TCAGGCG ACCCCGCGCGTTGGGGTGT AGAGCGC TGC ATCGGCGGCGT A TCCA TCGC AC TGGGGTGA AT TT AGACGTACCCGAGTTT TCC AA ACGC TC TCGC AGCCTTC A A AGGATT6CGA 109200 
OPSGARQPTSROMPPTOM 1 

TTGCGGTTGGTGAGGGAGTTCCAACAGTACTTAAAACGTGTTGTGCCCCCCCCTCGACCWATATnCCTCCCCGTGTCGrc 109320 

TTGGACGAGACTCGAGGCC^AAGTTCATGGACCATAGTATGCGTTTAAGGAGAGACC6CTMTTGGCGATGTACGCCC6GTGTCTATTTC^ 109440 

TACCAGACATGTGAATTTCATnACATATGTTTAAATAACAACCAATCATCGTGTGTCTACAGACGATATATAATATACATAAACACUTTGGMTTGTCTCACATGCAAAACATCT 109560 

ATAACACGGGTTGTTTCCACCCATCC^ 1W680 

< 

( A X 

CCCTCCCAACGATTATGTCAGGCGGCACGAAGCCCGCGATAACCCATAAAATACiCAC6GGCTTGT6GT6nCACGTAACCCCCCGCCGAT6GMAGtt 109800 
— reiteration R4 y 

* H « >< ft >< A )< 1 > 

GGAGGWHKttGGTACCCCGCCGATG^ 10992(J 

TTCTCTAAACCGTTGTAGAAAATCACAAAAAAATTTATTCAAAAACAAGTCGAAGAACTTCATATCTGAGKATGTAAACCCGTTCGCACTTCCTCW 110040 

< — — — — — — — . . „ „ 

6GTGUUAGG6GC^iGGGTTAAATT6GGCGTCC6CATGTC^ mm 

origin of DMA replication 

T6GCATGTGTCCAACCACCGnC6CACTTTCTTTCTATATATATATATATATATATATATATATATAGAGAAAGAGAGAGAGTTTCTTGTTCGCGCGTGTTCCCGCGATGTCW 110280 

ATGIHWTGTGGGCG^TTTTCACAGAATATATATATTCCAAAT6CAGCGGCAGGCTTTTTAAAATCGATTT6AC 110400 
TUAGttUCCCAATCGAAGGTCTCCCGCCtt^^ 110520 

MFCTSPATRGOSSESKPGAS 20 
nCATAACGCGACAGCGTCGAGTCMTTTTAAGGGAAAAGGTTACTACGGCCCCAAGGACATGTTTTKACCTCACCGWTACW 110 S40 

VDVNGKMEY6SAPGPLNGR0TSRGPGAFCTPGMEIHPARL 60 
GTTGATGnAACGGAAAGATGGAATATKATCTKACCAGGACCCCTGAACC^ 110760 

YEOINRVFLCIAQSSGRYTROSRRLRRICLOFYLMGRTRQ 100 
GTTGAGGACATCAACCGTGTTTTTTTATGTATTGCACAGTCGTCGWACGCGTCACGCGAGATTCACGAAGATTGCC^WATATGC^ 1 108 80 

RPTLACNEELLQL0PTQTQCLRATLMEVSHRPPRGED6FI 140 
CGTCCCAC6TTAGCGTGCT6GGAGGAAnGTTACAGCTTCAACCCACCCAGACGCAGTGCTTAC6C6CTACTTTAATMAAGTGTCCCATCGACCCCCTCGGG 111000 

EAPKVPLKR5ALECDVSDDGGEDDS0DDGSTPSDVIEFRD 180 
GA«^GCC6AATGTTCCTTTGCATAGGA6CGCACTGGAATGTGACGTATCTGATGATGGTGGTGAA6ACGATAGCGACGATGATGGGTCTAC6CCAT 111120 

SOAESSDGEOFIVEEESEESTDSCEPDGVPGOCYROGOGC 220 
TCCGACGC6GAATCATCGGACGGGGAAGACTTTATAGTGGAAGAAGAATCAGAGGAGAGCA^^ 1112 « 



lCCCCCGI0ACICCCICCUTC0CGTHCC»WCTCITCTTCC0TATCt6U6»nCCM0ICCTC6 112320 
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^gaIt.™^ 

180 

n6CttCtSIlCGGTG»TC»CCGGGKn»TCT»T»UC»CCGTGTUCTCGUI 
m «tt«GCTC tt GG6GTTACCAC^^ 

IMHMIUEIlKlltfEIUlf H 

JinUHttiCttCTTTtMTJTnOTITttMttMMTO 112*00 

N 1 

CCICnnAATtGG.TCACTtTATUGTGTGTC6GTGA6IATAnn6TACAGn 6 n6GACAACAG6imiGG1tCATTAK^ 

^tkaaca^ 
JgtI^gIcaIagIa^ 

I,g!,«a1!t«aa™^^^^ 

(iK666TCHC6TCH6GAACGGttACC^^ 113520 
««nIc«»K.GU»UC1GtmC»GCT«««^ 

tTAT1tACCGT6AHTAAAAICT6MAATAIAniATtAACCK "™ 

. . « , . . i y < » F L L A R 0 P » 6 P A V 0 I I S A 6 I » I F E H A T G 0 N S 2«1 
GGG^AKAAKG^A^ 

LcaVacaI™.^ 

: ACKACGICAICWCCAtCASCAGAGGlGnGtTTAACCK 1H240 



VZV DNA sequence 1801 

623 

Ugt^t™™^ 

(0^»naTiCCiaCG6ttCflI6IHi86»tCCCGT66OTUIW 1W4 ° 

| IgILkc—^^^ 
' cm«GGT«Gcocmc a ™^ 

„C«CG««CCCC^^ 

06PV60PECSDTSEESEtt»i»*ucwu 

cLtLt^cIttcc^ 
1gtc«^ 

• < 

OGGUGTGCGUCGGGTTTK^^^ 

reiteration M 

UCCGCGCCCCCTttCCnCGGCGGGGUCCGCGCC^ 

- > 

kccc^g6ggg™g»kkc^^ 

TKGCGUCGTHTGKGUGaG*^^ 
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~»™««oggcghutc^^ 

AU»m«C«GGT M «»C«GGGG««^ 

««™«""TCGGGT«GTCTum^ 

D * IAA * AEHR * RVVT 5S0l , ODLLF6ENGVMvrDcu C T»*. 
™^^«™«C«GGCCC^ 

»«c^«cLU*g™^^ b » 
™"~«^™ J* 

""GGCCcUgIgC^^ 

«»!o« 

GREH»PSPSHS00S0SKDGGSTtQNI0P6YR-:i«fon.., ,« 
™<«0UC«GCCCCTTC^^ 

G^LcUcaG^^ 
««i«*TG«ca*TGTU^ 

™«««gg«tIcc^ 

RV»D0H»VSr*KRRVSEPVTIT$6P»vnpp.viT U »,„,. 
CCGGGnGC»G*C6*IC*T6IGGITreCA*GGCO*G*GKGGGinCCGAI^GIGTO*rCKCTCGGGCtCIGIGGIGGlICCCCCtGCC 123000 

*PHG6FRRIPRG*LHTPVPS00iRtiYrTPCTiii». « « * „ 
GKCCaUCGGGGGmiCG^^ 

»TTGTnCa»CG6CCIGGC«ICTGCGCT*»GClTIG*TCCCGG^^ ^JJJ 
«™««««T«Gt\^^ 




r»c»wF¥L6SRLASARRRLL 946 

r.ou*iPPlPRVM>PP6F6A*ET 1186 

uri.cctisSSEOEOOVlGGRGCRSPP 1306 

1310 

Js/stc— .OTCCcaccnn™^ 

124884 

Fi , , Seances of .he -V = e ^ 
TheIR u -IRsj»nction would a.so^^ 

encoded protein sequences are shown in single letter ^amino aciu designated by 

sequence Leftward encoded proteins are sh °" n 

„umbera.,he.ertofthefirstl,neco„tan,.ngth^ function « transcrjpt 

DNA sequence will be deposited in the EMBL sequence library. 



identical in two independent overlapping clones (K P nl * * 

recognized region of size variability between gn« J -J. 1»3^ ^ 

the^^^^ 

or very infrequent in occurrence. Fo^^^ 

base pairs in the U s compone n .^"^ ^2^'™^ in size. The extent of genome 
results described above imply that the VZV genome is nmumq reiterations, 

size variability among different v.rus .solates, the ^NA ^"^^ 126000bp . 
indicatethatthegenome^ 



[Fig. 4: gel image with panels A, B and C, each with lanes 1, 2 and 3; size markers at 622, 492 and 404]

, • r V7v ^d„v mRisiA The three end-labelled probes used were (A) 881 

^^f^^'f^^V^^S^^^MA tad to totairf will, 
cell mRNA. Lane 1 contained one-tenth the amount of probe represented in lanes 2 and 3. The sizes of
DNA^arke^ 

infected cell RNA are indicated by arrows. 

proposed for some genes whereas no good candidates may be found for others. Thus, the

^y^nnSffaMod agreement with the experimentally determined value of 

ends of the dPyK mRNA, and their locations with respect to the DNA sequence in Fig. 1 are
indicated in the region of gene 36. The S1 nuclease results indicate that the 5' end of the mRNA
mapslJ wEZa from the Accl site (Fig. 4, lane A3) and ™%*^ f ™£*%} 
(Fig. 4, lane C3), consistent with the function of the sequence TATTAAA underlined in Fig.
1 (64364) as the TATA element. Three similar sequences (TATATTA, TATAAAA and
TATAATA in Fig. 1; 64433, 64469 and 64611) are present between this element and the
initiation codon for dPyK. Thus, the location of the 5' end of dPyK mRNA could not have been
predicted from the DNA sequence alone. Similarly, the 3' end of the mRNA might have been
expected to map near the AATAAA or ATTAAA elements underlined in Fig. 1 (66125, 65978).
In fact, the 3' end is located 580 bp downstream from the XmaI site (Fig. 4, lane B3), and is thus
defined by the AGTAAA element (65859). This result counsels caution in predicting the 3' ends
of other VZV mRNAs; nonetheless, the general degree of confidence was sufficient
for proposed elements involved in polyadenylation to be included in Fig. 1 and 2. In any case,
there are precedents for the function of AGTAAA in polyadenylation (Donehower et al.,
1981; Tamura et al., 1981; Capon et al., 1983). The possibility that the dPyK AGTAAA
resulted from a base change in an AATAAA during cloning was ruled out by sequencing this
region in an independent clone (SstI h). The element AGTAAA has also been proposed in
Fig. 1 and 2 to function in polyadenylation of the transcript from gene 28, which encodes the
DNA polymerase.

The dPyK gene is interesting in other respects. Although it encodes a protein with substantial 
homology to HSV-1 dPyK, as described below, the untranslated 5' region of the mRNA is 
considerably larger, at 420 bp, than that of the HSV-1 mRNA, at 110 bp (McKnight, 1980).
Moreover, the untranslated 5' region of the VZV gene contains two ATG codons in different
reading frames from the initiation codon for dPyK. Each of these is followed within a short
distance by a termination codon and, in view of the work of Kozak (1984), this structural aspect
may affect translational efficiency of the mRNA by requiring reinitiation for expression of
dPyK. As the only VZV gene promoter identified experimentally to date is that for the dPyK
gene, responsible comparisons between HSV-1 and VZV promoter regions may be made only
for this gene. Little similarity was detected between regions upstream of the TATA elements.
The differences between the structures of the promoters and untranslated 5' mRNA regions of 
the VZV and HSV-1 dPyK genes suggest that transcriptional and translational control might 
differ considerably between the two genes. 
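The observation about upstream ATG codons can be illustrated with a short sketch. The Python function below scans an untranslated 5' leader for ATG codons that are followed by an in-frame termination codon, and reports whether each one lies in the same reading frame as the main initiation codon. The function name, its arguments and the toy sequence are assumptions made for illustration only; the toy sequence is not VZV sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def upstream_orfs(seq, main_atg_index):
    """Find ATG codons upstream of the main initiation codon.

    Returns a list of (atg_index, stop_index, same_frame_as_main) tuples,
    one for each upstream ATG that is followed by an in-frame stop codon.
    All indices are 0-based positions in `seq`.
    """
    seq = seq.upper()
    found = []
    for i in range(main_atg_index):
        if seq[i:i + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this upstream ATG.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                same_frame = (main_atg_index - i) % 3 == 0
                found.append((i, j, same_frame))
                break
    return found

# Toy leader: one upstream ATG (index 2) with a stop at index 8,
# followed by the main ATG at index 12 in a different frame.
toy = "CCATGGCTTAACATGGCAGCA"
print(upstream_orfs(toy, 12))
```

An upstream ATG that is out of frame and quickly terminated, as in the toy example, is the arrangement that would require reinitiation for translation of the main ORF.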

The locations of potential polyadenylation sites near the 3' ends of ORFs are summarized in 
Fig. 2. Many genes apparently possess unique polyadenylation sites, whereas others are present
in 3'-coterminal families containing up to four genes. The 3'-coterminal gene arrangement has
been well-characterized in HSV-1, where sets of genes are expressed as mRNAs with unique 5'
ends but shared 3' ends (for review, see Wagner, 1985). Thus, the mRNAs expressed from genes
towards the 5' end of a 3'-coterminal family contain extensive untranslated 3' regions. The VZV
genome contains 216 AATAAA elements, but only 48 unique potential polyadenylation sites are
indicated in Fig. 1 and 2. Moreover, seven of these sites contain ATTAAA rather than
AATAAA, and two contain AGTAAA. It is possible that the presence of some AATAAA-related
elements close to the ends of ORFs is merely fortuitous, and thus some genes shown with
unique polyadenylation sites in Fig. 1 and 2 actually may be members of 3'-coterminal families.
This comment applies particularly to VZV mRNAs predicted to have unique 3' ends whose
HSV-1 counterparts are polyadenylated as part of a 3'-coterminal family. For example, HSV-1
transcripts corresponding to VZV ORFs 40 and 41 are 3'-coterminal (Costa et al., 1981), as are
those corresponding to ORFs 43 and 44 (Costa et al., 1984). Ostrove et al. (1985) recently 
reported a transcript map for the VZV genome, based on Northern blot analysis using relatively 
large cloned DNA fragments. The arrangement of VZV mRNAs suggested by this approach 
correlates well with that deduced from the DNA sequence in some regions, but there are 
apparent discrepancies in others. Confirmation of the transcript map and resolution of these 
differences must await the mapping of mRNA termini. 
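The kind of motif count described above can be sketched in a few lines of Python. The code below lists every occurrence of AATAAA and the two variants mentioned in the text (ATTAAA and AGTAAA) on both strands of a sequence. The function names, the choice to report leftward hits at the rightward-strand coordinate of their first base, and the toy sequence are assumptions for illustration; this is not the procedure used to prepare Fig. 1 and 2, and a motif hit alone does not establish a functional polyadenylation site.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def find_all(seq, motif):
    """Yield the 0-based start of every match, including overlapping ones."""
    start = seq.find(motif)
    while start != -1:
        yield start
        start = seq.find(motif, start + 1)

def polya_candidates(seq, motifs=("AATAAA", "ATTAAA", "AGTAAA")):
    """Return sorted (position, strand, motif) tuples.

    Positions are 1-based coordinates on the given (rightward) strand of the
    first base of the motif as read 5' to 3' on its own strand.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = []
    for motif in motifs:
        for i in find_all(seq, motif):
            hits.append((i + 1, "+", motif))
        for i in find_all(rc, motif):
            hits.append((len(seq) - i, "-", motif))
    return sorted(hits)

# Toy sequence: AATAAA on the rightward strand, AGTAAA on the leftward strand.
toy = "GGAATAAACCTTTACTGG"
print(polya_candidates(toy))
```

Filtering such hits by their distance from the 3' end of the nearest ORF would be one way to separate candidate sites from fortuitous matches, which is the distinction the text draws.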

Although identified overlapping polypeptide-coding regions are not extensive, the VZV 
genome shows considerable economy in its gene layout. Almost the entire sequence encodes 
virus proteins, and it is likely that many regions involved in control of gene expression are 
located in the coding regions of adjacent genes. However, there are four notable regions for 
which no protein products are predicted. One is located at the left end of the L segment and 
contains TR L. It may contain sequences which promote cleavage of full-length genomes from
concatemers produced during DNA replication (Davison, 1984). The second region, also about
600 bp in size, is located at the right end of the L segment and contains IR L. Part of its function is
likely to be as a promoter for gene 61. The third region is about 1400 bp in size and is located
between genes 60 and 61. It contains an unusual direct repeat of 88 bp, with three mismatches,
separated by 24 bp (102020-102219 in Fig. 1; A indicates each repeat). The function of this
structure is unknown; it could form part of a control element for gene 60. The fourth region is
about 1400 bp in size and is present twice in the genome: in IR S between genes 62 and 63 and in
TR S between genes 70 and 71. It contains the reiteration R4 and the promoters for the two genes
on either side. It also contains a palindrome (Davison & Scott, 1985) which forms part of a
functional origin of DNA replication (Stow & Davison, 1986). In view of the compact
arrangement of the rest of the genome, these four regions are likely to have important functions.
Although no protein products have been assigned to them, it is possible that some contain small
coding exons or perhaps larger non-coding exons. Alternatively, they may encode functional
RNAs which are not expressed as proteins. A third alternative is that they may encode no RNA
or protein species, but are sites for specific recognition during the virus life-cycle. This is 
certainly the case for the origin of DNA replication, and probably for sequences at the left end of 
the genome potentially involved in DNA maturation. 
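The coordinates quoted for the direct repeat between genes 60 and 61 can be checked with simple arithmetic: two 88 bp copies separated by 24 bp should occupy 200 bp, which matches the inclusive span 102020-102219. The sketch below expresses that check in Python and adds a helper for counting mismatches between two repeat copies of equal length. The helper names and the toy strings are assumptions for illustration; the actual repeat sequences are not reproduced here.

```python
def span_length(first, last):
    """Length in bp of an inclusive coordinate range."""
    return last - first + 1

def count_mismatches(copy_a, copy_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(copy_a) != len(copy_b):
        raise ValueError("repeat copies must have the same length")
    return sum(1 for a, b in zip(copy_a, copy_b) if a != b)

REPEAT_BP = 88   # length of each copy, from the text
SPACER_BP = 24   # separation between the copies, from the text

# Two copies plus the spacer should fill the quoted coordinate range.
print(span_length(102020, 102219), 2 * REPEAT_BP + SPACER_BP)

# Toy comparison of two short "copies" differing at one position.
print(count_mismatches("ACGTACGT", "ACGAACGT"))
```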

VZV gene function

Comparison of the proposed arrangement of VZV genes in Fig. 2 with published HSV-1
transcript mapping data (for review, see Wagner, 1985) indicates that both viruses have a 
similar gene layout. This view was confirmed by available HSV-1 sequence data, and allowed 
the functions of several VZV genes to be assigned on the basis of primary amino acid sequence 
homology of their products to HSV-1 proteins. These conclusions, and the precise locations of 
VZV genes and molecular weights of their primary translation products, are summarized in 
Table 1. Genes encoding glycoproteins, homologues of HSV-1 immediate-early proteins, and 
proteins with extreme properties of hydrophobicity, hydrophilicity, charge or amino acid 
composition are also indicated. All but three of the functional assignments were made on the
basis of HSV-1 gene location and confirmed by amino acid sequence homology with HSV-1
proteins. The dUTPase was assigned on the basis of the location of the HSV-1 gene reported by
Preston & Fisher (1984). The thymidylate synthetase and protein kinase genes were located on
the basis of amino acid sequence homology of their products to proteins of known function in the
NBRF protein database. Approximately 30 VZV proteins are homologous to proteins predicted
from the complete EBV sequence determined by Baer et al. (1984); the implications of this result
in predicting the functions of EBV genes will be discussed elsewhere (A. J. Davison & P. 
Taylor, unpublished data). 
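The Start, Stop and Residues columns of Table 1 can be cross-checked against one another. The sketch below assumes that Stop is the coordinate of the last base of the final sense codon (so the termination codon is excluded) and that a gene whose Start exceeds its Stop is encoded leftward; both assumptions are inferred from the tabulated values rather than stated in the text. Under them, the inclusive distance between Start and Stop should equal three times the number of residues. The function names are hypothetical, and the four rows are copied from Table 1.

```python
def orf_length(start, stop):
    """Inclusive length in bp between the start and stop coordinates."""
    return abs(stop - start) + 1

def strand(start, stop):
    """Orientation implied by the order of the two coordinates."""
    return "rightward" if stop > start else "leftward"

def is_consistent(start, stop, residues):
    """True if the coordinate span is exactly three bases per residue."""
    return orf_length(start, stop) == 3 * residues

# gene: (start, stop, residues), values as listed in Table 1
rows = {
    1: (915, 592, 108),
    3: (2447, 1911, 179),
    10: (12160, 13389, 410),
    62: (109133, 105204, 1310),
}

for gene, (start, stop, residues) in rows.items():
    print(gene, strand(start, stop), orf_length(start, stop),
          is_consistent(start, stop, residues))
```

A row that fails this check is more likely to reflect a transcription error in the coordinates than an unusual gene, so the test is a cheap way to proofread the table.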

Fig. 5 shows examples of homology between VZV and HSV proteins displayed by optimal 
alignment of predicted amino acid sequences. Fig. 5(a) shows a comparison of the product of 
VZV gene 18 with the small subunit of the HSV-2 ribonucleotide reductase; these proteins are 
highly conserved. The lower degree of homology between the VZV and HSV-1 dPyKs shown in 
Fig. 5(b) is in accord with the DNA hybridization data of Davison & Wilkie (1983), who were
able to detect conservation of the ribonucleotide reductase gene but not of the dPyK gene. The
degree of homology shown in Fig. 5(c) between the product of VZV gene 5 and the potential
HSV-1 membrane protein reported by Debroy et al. (1985) is about the same as that observed for
the dPyKs. However, several pairs of genes are less conserved than this, and only specific 
regions of the proteins were detected as being conserved by this approach. Fig. 5(d) shows the 
conservation of a region towards the carboxy termini of the glycoprotein product of VZV gene 
14 and HSV-1 glycoprotein C (gC). Although the homology in this region is significant, the 
major part of each protein is divergent. The divergent region of the VZV protein contains a 
repeated amino acid sequence coded by reiteration R2. 

Most glycoprotein genes encode primary translation products with distinct characteristics: a 
hydrophobic signal sequence near the amino terminus for translation of the mRNA on 
membrane-bound ribosomes (Blobel, 1980) and a more extensive hydrophobic region followed 
by basic residues near the carboxy terminus for anchoring the protein in the membrane (Tomita 
& Marchesi, 1975). The VZV genome contains five such genes: 14, 31, 37, 67 and 68. Genes 14, 
31, 37 and 68 are counterparts of identified HSV-1 glycoprotein genes, as shown in Table 1. 
Gene 67 is the counterpart of an HSV-1 gene whose predicted product also has the 
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Table 1. Properties of proteins coded by predicted VZV genes 



Gene  Start*       Stop†        Residues     Mol. wt.‡    Extreme properties
1     915          592          108          12103        Hydrophobic (C)
2     [illegible]  [illegible]  [illegible]  25983
3     2447         1911         179          19149
4     4141         2786         452          51540        Hydrophilic (N)
5     5274         4255         340          38575        Hydrophobic
6     8577         5329         1083         122541
7     [illegible]  [illegible]  [illegible]  28245
8     10667        9480         396          44816
9     [illegible]  [illegible]  [illegible]  [illegible]  Hydrophilic?
10    12160        13389        410          46573
11    13590        16046        819          91825        Hydrophilic & acidic
12    16214        18196        661          74269
13    18441        19343        301          34531
14    21113        19434        560          61350
15    22478        21261        406          44522        Hydrophobic
16    23794        22571        408          46087
17    24149        25513        455          51365
18    26493        25576        306          35395        Acidic
19    28845        26521        775          86823
20    30475        29027        483          53969
21    30759        33872        1038         115774
22    34083        42371        2763         306325
23    43138        42434        235          24416        Hydrophilic; S, T, Q-rich
24    44021        43215        269          30451        Hydrophobic (C)
25    44618        44151        156          17460        Hydrophilic; acidic ([illegible])
26    44506        46260        585          65692
27    46127        47125        333          38234        Hydrophilic & basic ([illegible])
28    50636        47055        1194         134041
29    50857        54468        1204         132133
30    54651        56960        770          86968
31    57008        59611        868          98062
32    59766        60194        143          15980        Hydrophilic & acidic
33    62138        60324        605          66043
34    63910        62174        579          65182
35    64753        63980        258          28973        Basic
36    64807        65829        341          37815
37    66074        68596        841          93646
38    70293        68671        541          60395
39    70633        71352        240          27078        Hydrophobic
40    71540        75727        1396         154971
41    75847        76794        316          34387
42    78038        76854        395 \        82752
45    82593        81538        352 /        (spliced)
43    78170        80197        676          73905
44    80360        81448        363          40243
46    82719        83315        199          22544        Hydrophilic & acidic
47    83168        84697        510          54347
48    84667        86319        551          61268
49    86226        86468        81           8907         Hydrophilic
50    87882        86578        435          48669        Hydrophobic
51    87881        90385        835          94370
52    90493        92805        771          86343
53    93850        92858        331          37417
54    95984        93678        769          86776
55    95996        98638        881          98844
56    98568        99299        244          27166        S, T-rich
57    99626        99414        71           8079         Hydrophilic & basic
58    100272       99610        221          25093        Hydrophilic & basic
59    101219       100305       305          34375
60    101649       101173       159          17616        Acidic
61    104485       103085       467          50913        Hydrophilic
62    109133       105204       1310         139989
63    110581       111414       278          30494        Hydrophilic & acidic
64    111565       112104       180          19868
65    112640       112335       102          11436        Hydrophobic (C)
66    113037       114215       393          43677

Function‖

Homologue of HSV-1 IE63¹
dUTPase
trans-inducing factor²
Thymidylate synthetase³
Glycoprotein (gpV); homologue of HSV-1 gC⁴
Small subunit of ribonucleotide reductase⁵
Large subunit of ribonucleotide reductase⁶
DNA polymerase⁷
Major DNA-binding protein⁷
Glycoprotein (gpII); homologue of HSV-1 gB⁸
Deoxypyrimidine kinase⁹
Glycoprotein (gpIII?); homologue of HSV-1 gH¹⁰
Major capsid protein¹¹
Exonuclease¹²
Homologue of HSV-1 IE175¹³
Homologue of HSV-1 IE68¹⁴
Protein kinase¹⁵
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characteristics of a glycoprotein (McGeoch et al., 1985). The ^in produ^ of VZV 
S an 6 hi. been identified unequivocally: they encode gpll (Keller 
^(Davison et a,., 1985) and gpl (Ellis et al., 1985) , respect.vely 
iUhedVZV^^ 

Debroy et al., 1985). Thus, it is possible that VZV gene 5, and perhaps genes 15, 39 and 50,

arrangements are similar, there are limited regions of significant difference. The most
extensive encompasses the S segment, which in HSV-1 contains 13 unique genes (McGeoch
et al., 1985) and in VZV contains only seven. The relationship between the S segments of
VZV and HSV-1 has been discussed separately (Davison & McGeoch, 1986). In summary,
each VZV gene has a homologue in HSV-1, but not all HSV-1 genes have counterparts in
VZV. The 'missing' genes include the gene which encodes HSV-1 IE12, and the gene which
encodes glycoprotein D.

end. Also, the inverted repeats flanking U L are much larger in HSV-1
than they are in VZV (88.5 bp), and the single gene thus far identified from the HSV-1 sequence
of this region specifies a spliced immediate-early mRNA encoding IE110 (L. J. Perry, F. J.
Rixon & D. J. McGeoch, personal communication). At the present stage of analysis no
homologue of this gene has yet been detected in VZV. Thus, the differences in gene arrangement
between VZV and HSV-1 in the S segment and at the ends of the L segment result in VZV



67 114496 115558 354 39362
68 115808 117676 623 69953
69 118332 117793 180 19868
70 119316 118483 278 30494
71 120764 124693 1310 139989



Hydrophilic & acidic
Glycoprotein (gpIV); homologue of HSV-1 US7 14
Glycoprotein (gpI); homologue of HSV-1 gE 14
Homologue of HSV-1 IE68 14
Homologue of HSV-1 IE175 13



* Location in rightward strand of first base of initiating ATG codon. All except 14, 31 and 68 refer to the first ATG in the ORF.

References are given [...]
assignment of VZV gene function [...] (1984). 5 Y. Nikas & J. B. Clements, personal communication.
[...] Davison & R. W. Honess, unpublished [...]. [...] McKnight (1980). 10 U. Gompels & A. C. [...]
[...] McLauchlan & Clements (1983). [...] Draper et al. [...]
[...] (1985). 15 McGeoch & Davison (1986a).
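
The coordinate and codon columns of the table rows for genes 69 to 71 can be cross-checked arithmetically. The sketch below assumes that the two coordinate columns give the first and last base of each ORF in rightward-strand numbering and that the next column is the codon count; the function names are illustrative and do not come from the paper.

```python
# Minimal sketch: cross-check ORF coordinates against codon counts.
# Assumption: columns are (first base, last base, codons); names are hypothetical.

def orf_length(start: int, end: int) -> int:
    """Number of bases spanned by an ORF, whichever strand it lies on."""
    return abs(end - start) + 1

def strand(start: int, end: int) -> str:
    """An ORF whose first base has the higher coordinate reads leftward."""
    return "rightward" if end >= start else "leftward"

def is_consistent(start: int, end: int, codons: int) -> bool:
    """True when the span is exactly three bases per listed codon."""
    return orf_length(start, end) == 3 * codons

# Rows for genes 69, 70 and 71 as printed in the table.
ROWS = {
    69: (118332, 117793, 180),
    70: (119316, 118483, 278),
    71: (120764, 124693, 1310),
}

for gene, (start, end, codons) in ROWS.items():
    print(gene, strand(start, end), orf_length(start, end),
          is_consistent(start, end, codons))
```

A row that fails this check is a candidate for a transcription error in one of its coordinates.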
[Fig. 5, panels (a) to (d): aligned amino acid sequences; the sequence text did not survive extraction.]

Fig. 5. Optimal alignment displays of the predicted (single-letter) amino acid sequences of (a) the
product of VZV gene 18 and the small subunit of HSV-2 ribonucleotide reductase (McLauchlan &
Clements, 1983); (b) the product of VZV gene 36 and HSV-1 strain 17 dPyK (D. J. McGeoch, personal
communication); (c) the product of VZV gene 5 and the potential HSV-1 membrane protein described
by Debroy et al. (1985); (d) the product of VZV gene 14 and HSV-1 gC (Draper et al., 1984). In each
example the VZV sequence is shown above the HSV sequence.
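
One simple way to summarise a pairwise alignment display of this kind is the percentage of aligned positions carrying the same residue. The sketch below is only an illustration: the paper does not state how similarity was scored, the choice to ignore gapped columns is an assumption, and the two short sequences are hypothetical rather than taken from the figure.

```python
# Minimal sketch: percent identity between two pre-aligned sequences.
# Assumptions: gaps are written as "-" and gapped columns are not counted.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical residues as a percentage of ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared

# Hypothetical toy alignment, not real VZV or HSV sequence.
upper = "MKT-LLV"
lower = "MRTALLI"
print(round(percent_identity(upper, lower), 1))
```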



remarkable degree of homology to prokaryotic and eukaryotic thymidylate synthetases (A. J. 
Davison & R. W. Honess, unpublished data). Instead, Frink et al. (1983) have shown that this
region of the HSV-1 genome contains a small gene which is present in a 3'-coterminal family 
with the gC gene (the counterpart of VZV gene 14) and is thus in the opposite relative 
orientation from VZV gene 13. Thus, although VZV and HSV-1 are very similar in gene layout 
in the L segment, this discovery enhances the possibility that one or more other local differences 
may exist. 

The VZV and HSV-1 genomes also differ in another functional aspect. The region between
the HSV-1 DNA polymerase and major DNA-binding protein genes contains a large
palindrome (Gray & Kaerner, 1984; Weller et al., 1985; Quinn & McGeoch, 1985) which forms
part of an origin of DNA replication (Weller et al., 1985). This origin is termed ori L to
distinguish it from ori S in TR S and IR S . Plasmids containing the corresponding region of the
VZV genome do not contain a palindrome and do not function as origins (Stow & Davison,
1986). Comparison of cloned and virion DNA fragments (data not shown) has ruled out the
possibility that a similar palindrome might have been deleted during cloning, as occurs in clones
containing HSV-1 ori L . Therefore, although VZV has an origin corresponding to HSV-1 ori S
(110087 to 110350 and 119547 to 119810 in Fig. 1; Stow & Davison, 1986), it does not possess a
second functional origin in a location equivalent to that of HSV-1 ori L . However, it cannot be
ruled out that VZV has a second origin elsewhere in the genome which may have been deleted on
cloning.
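
In sequence terms, a palindrome of the kind discussed here is a stretch of DNA that is identical to its own reverse complement. The sketch below shows one way such a stretch could be searched for, and also confirms that the two ori S coordinate ranges quoted above span the same number of bases. The function names and the short test sequence are hypothetical; no real VZV or HSV-1 sequence is used, and this is not the method of the cited studies.

```python
# Minimal sketch: find the longest perfect DNA palindrome in a sequence.
# A DNA palindrome equals its own reverse complement (e.g. GAATTC).

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def longest_palindrome(seq: str) -> tuple[int, int]:
    """Return (start, length) of the longest palindrome, or (-1, 0) if none.

    Expands outward from every gap between adjacent bases, requiring each
    left-hand base to be the complement of the matching right-hand base.
    """
    seq = seq.upper()
    best_start, best_len = -1, 0
    for centre in range(1, len(seq)):
        left, right = centre - 1, centre
        while (left >= 0 and right < len(seq)
               and COMPLEMENT.get(seq[right]) == seq[left]):
            left -= 1
            right += 1
        length = right - left - 1
        if length > best_len:
            best_start, best_len = left + 1, length
    return best_start, best_len

def span_length(first: int, last: int) -> int:
    """Number of bases in an inclusive coordinate range."""
    return abs(last - first) + 1

# Hypothetical toy sequence with a 10-base palindrome embedded in it.
toy = "GGTTGAATTCAAGG"
start, length = longest_palindrome(toy)
print(start, length, toy[start:start + length])

# The two ori S ranges quoted in the text cover the same number of bases.
print(span_length(110087, 110350), span_length(119547, 119810))
```

Only perfect palindromes are reported; a search on real data would probably need to tolerate mismatches and a central spacer.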

Significance of the VZV sequence 

The DNA sequence provides a firm foundation on which to build a detailed understanding of 
VZV infection at the molecular level. This knowledge may be applied in the development of 
effective treatments for the diseases caused by this virus. The sequence has also given the first 
complete view of gene layout in a member of the Alphaherpesvirinae, and has allowed our 
knowledge of the proposed functions of VZV genes to increase from almost nothing to equal that 
of HSV-1. Clearly, the sequence will be important in determining the functions of the majority 
of VZV genes whose role in virus growth is at present unknown. The way in which data from one 
herpesvirus may be so usefully applied to another thus encourages herpesvirologists to cultivate
a more catholic approach towards the family of viruses they study. 

The authors are grateful to John Subak-Sharpe for continuing interest in this work, to Duncan McGeoch for 
advice and discussion, to Philip Taylor for extensive help with computer programming and to Catherine Joyce for 
supplying the E. coli strain which overproduces the Klenow fragment of DNA polymerase I. Thanks are due also
to Duncan McGeoch, Frazer Rixon, Lise Perry, Nigel Stow, Mike Dalrymple, Yannis Nikas, Ursula Gompels 
and Ed Wagner for making available unpublished HSV-1 data, and to John Subak-Sharpe and Duncan McGeoch 
for critically reading the manuscript. 